EDITORIAL


Intentional equity 


Over a decade ago, when I was chief scientist at
the U.S. National Aeronautics and Space Admin- 
istration, I spoke at a conference called Women 
and Science: Celebrating Achievements, Chart- 
ing Challenges. I lauded women working in as- 
trophysics, government, and science policy in 
the United States and elsewhere, but said that 
progress was mixed—the veneer of success for women 
across the sciences, and in 
science leadership, was too 
thin across the globe. What 
has changed since then? 
Cultural barriers, a lack of 
enlightened policies, and 
the need for role models 
and support systems still 
exist worldwide. However, 
today there is good reason 
to be optimistic. The inter- 
national scientific commu- 
nity is coming together in- 
tentionally to acknowledge 
and tackle gender equity. 
This year, I made my first 
trip as the U.S. National
Science Foundation (NSF) 
director to the Next Ein- 
stein Forum (NEF) in Af- 
rica, where I was on a panel 
discussing women in science, 
technology, engineering, and 
mathematics (STEM) fields. 
Scientists, engineers, and 
innovators from across the continent—women and 
men—all were hungry for change and fighting for 
equality. A resulting declaration committed to priori- 
tizing the enrollment of women in STEM programs at 
the tertiary and postgraduate levels in Africa. I also 
was in India for the annual meeting of the Global Re- 
search Council (GRC), where heads of research funding 
agencies took up the issue of gender equity in STEM re- 
search. Even though I was the only woman (until then) 
on the 12-member governing board, everyone agreed to 
a statement of actions that countries could implement 
to further gender equity in STEM fields, including gen- 
der considerations in research design and analysis. 
The GRC board committed to collect and share data to 
chart progress. 
The global economy, too, is now being viewed 
through a gender equity lens. Japan, host country of the 
May 2016 G7 Science and Technology Ministers’ Meeting
of leading industrial nations, is encouraging G7
nations to lead efforts in “inclusive innovation” to en- 
sure that everyone accesses and benefits from science 
and technology. Further, the final G7 report encourages 
the development of “policy and working environments 
in which equal opportunity allows women to exert 
their abilities [and] advance their career prospects.” 
Such changes help STEM equality and will attract and 
retain talented women in 
STEM careers. 

What about the United 
States? Women now earn
about half of all science and 
engineering bachelor’s de- 
grees, yet they account for 
only 30% of the U.S. science 
and engineering workforce. 
In some STEM fields, such 
as mechanical engineering, 
the percentage of women 
is in the single digits. NSF 
will continue to advance 
equity through data-driven 
decision-making. Our Ca- 
reer-Life Balance Initiative, 
for example, mitigates fac- 
tors that can negatively af- 
fect women’s ability to carry 
out research, especially dur- 
ing the early years of their 
careers. NSF’s ADVANCE 
program encourages uni- 
versities to use institu- 
tional data about recruitment and retention to develop 
structural changes to improve representation and ad- 
vancement of women. These deliberate actions by NSF 
complement the research that NSF supports in the sci- 
ence and practice of STEM gender equity. Projects range 
from computer programming camps to encourage girls, 
to studies on creating classroom environments that 
attract and retain female students. One of our newest 
initiatives, NSF INCLUDES, is fostering innovative alli- 
ances and networks that can scale up effective methods 
for addressing shortages and broadening the participa- 
tion of women and others who are underrepresented in 
STEM fields. 

Ensuring global equity for women in science and en- 
gineering research requires personal commitment to 
action within one’s own sphere of influence. It is a call 
to action we all must embrace. 
-France A. Córdova


France A. Córdova
is director of the
U.S. National
Science Foundation, 
Arlington, VA, USA. 
“Let's ask ourselves why haven't we beaten this
epidemic. Could it be because we don't want to?”


South African-born actress and AIDS activist Charlize Theron, in a speech last
week at the International AIDS Conference in Durban. Theron suggested that society
values some lives more than others, contributing to the virus’s spread.


IN BRIEF


The Norway rat 
New Zealand’s goal: Rat-free by 2050


An isolated archipelago, New Zealand once hosted
almost 200 species of native birds, many of them
flightless like the iconic kiwi. But introduced
species—rats, possums, and a weasel-like carnivore
called a stoat—kill about 25 million of these native
birds every year. This week, the country’s prime
minister, John Key, announced a $20 million commit-
ment of seed money to set up Predator Free New Zealand
Ltd., a company that would lead the charge in getting rid
of all of these predators by 2050. Until now, the country’s
eradication efforts have focused on small islands; those
efforts boast a 90% success rate, says James Russell, a
conservation biologist at the University of Auckland in
New Zealand. The new effort, Russell says, is “the modern
equivalent to landing someone on Mars,” requiring new
technologies and billions of dollars to succeed. But he is
optimistic because local communities and organizations
are on board. However, staying rat-free after 2050 might
be the challenge, says Alberta Agriculture and Forestry
conservation biologist Phil Merrill in Lethbridge. (That
Canadian province is rat-free.) “They can do it if they can
prevent the rats from jumping off the boats,” he predicts.


AROUND THE WORLD
Public wary of tech enhancements


WASHINGTON, D.C. | A new poll finds
that Americans have serious questions
about the use of new technologies to
enhance the lives of healthy people. The
Pew Research Center in Washington, D.C.,
asked 4726 U.S. adults for their views
on editing genes in utero to reduce the
prevalence of serious diseases, implant-
ing brain chips into healthy individuals to
augment their mental skills, and infusing
fit people with synthetic blood to increase
performance. The results, announced this
week, show that Americans are much
more comfortable with the idea of using
those technologies to help correct existing
problems or cope with a disability than
to achieve a higher level of functioning.
Those with strong religious beliefs are
especially troubled by the prospect of such
enhancements, seeing them as meddling
with nature, and women were significantly
more wary than men. “The strength of
differences based on religious commit-
ment is one of the most surprising results,”
says Cary Funk, an associate director of
research at Pew. “People were able to grasp
the nuances of these different technolo-
gies,” she added, “even though none has
been put into practice.”
http://bit.ly/techenhance


Accord nearer on HFCs 


VIENNA | Nations have moved a step 
closer to a global agreement to curb the 
use of hydrofluorocarbons (HFCs), potent 
climate-warming compounds used in 
aerosols and for cooling. After days of 
meetings in Vienna, on 24 July negotiators 
from more than 150 nations reported 
substantial progress toward a deal that 
calls for richer nations to essentially end 
the use of HFCs by the 2030s, with poorer 
nations getting additional time to meet 
that goal. But negotiators must still re- 
solve exact timetables, and how to fund the 
development and deployment of sub- 
stitute technologies. Nations hope to sign 
a final deal in Kigali in October. 
Researchers estimate that phasing
out HFCs, which became a widely
used replacement for ozone-eating
chlorofluorocarbons, could prevent as
much as 0.5°C of planetary warming.


Gillnet ban aims to save vaquita 


MEXICO CITY | Fishing with gillnets
will soon be permanently banned in the
northernmost part of the Gulf of California,
Mexico’s national fisheries commission
announced on 19 July. The ban, which will
begin in September, is intended to protect
the vaquita, a small porpoise that lives only
in those waters. The vaquita is listed as
critically endangered, with just 60 individu-
als remaining; the mammals often die after
becoming tangled in the vertical “walls” of
netting. Fishing at night, when the rules are
hardest to enforce, will also be prohibited,
and fishing boats will have to depart from
and return to specially designated docks.
The ban focuses on the legal shrimp fishery,
but gillnets are also used to illegally catch
totoaba, an endangered fish whose swim
bladder sells for $60,000 per kilogram in
China. Enforcing the ban on new nets will
be key to saving the vaquita, but “remov-
ing abandoned ‘ghost’ gillnets from vaquita
habitat is the most immediate and urgent
priority,” says Omar Vidal, CEO of World
Wildlife Fund Mexico in Mexico City.


A vaquita caught in fishing nets.


The cactus collection at the Tel Aviv University Botanical Garden in Israel.


Israeli botanical gardens face budget cuts


Israel’s 11 botanical gardens are scrambling to cope with deep cuts in funding
from the government's agricultural ministry. Spending on the gardens, which host
research and education programs and are often associated with universities, is
down by more than 50% this year, from about 4.5 million shekels ($1.1 million)
in 2014 to 2 million shekels this year. That's a reprieve from a 98% cut that the
ministry announced last year, but still a major blow for the gardens, which rely heavily
on ministry funds to pay for basic operations. “There were times this year when we
couldn't afford potting soil, or even printer paper,” says Tal Levanony, curator of the
garden associated with Tel Aviv University. http://bit.ly/Israeshekellgardens


Published by AAAS


‘First-in-human’ trials overhauled


LONDON | The European Union is beefing up
protections for volunteers in phase I clinical
trials in the wake of a disastrous clinical
study in Rennes, France, that resulted in the
death of one volunteer and the hospitaliza-
tion of five others (Science, 12 February,
p. 642). On 21 July, the European Medicines
Agency (EMA) in London announced, in a
“concept paper” written by an international
group of experts, that it wants to improve
strategies to identify and reduce risks in 
“first-in-human” (FIH) studies. The cur- 
rent guideline for FIH studies dates from 
2007. EMA will seek in particular to reduce 
the risks of studies that combine different 
subtrials, which are becoming increasingly 
common; the trial in Rennes included 
subtrials using single and multiple dose 
administration, as well as trials on drug 
interactions with food and on pharmaco- 
dynamics, the study of a drug’s biochemi- 
cal and physiologic effects on the body. 
This increased complexity requires a new, 
structured approach, EMA says, in which 
decisions on each new step need to be based 
on the data collected at the previous one. 
EMA invites comments from stakeholders 
until 30 September, after which it will pub- 
lish a draft revised guideline, probably later 
this year. http://bit.ly/_EMArules 


A pan-European pension fund 
MANCHESTER, U.K. | The European Union 
makes it easy for researchers to hop 
between member countries for research 
and training opportunities. Now, it is
easing a practical concern linked to this 
mobility: securing a full retirement pen- 
sion. Most European countries offer social 
security, but universities and institutions 
often provide “supplemental” pensions, 
which mobile researchers could lose out 
on. A pan-European pension plan called 
RESAVER that would allow researchers 
to carry supplemental pension benefits 
with them has been in the making since 
October 2014, and it is now up and 
running, organizers explained on 24
July at the EuroScience Open Forum in
Manchester. Only three organizations, 
including a private university in Hungary 
and a research center in Italy, are ready 
to pay into the fund. But another 100 may 
join in the near future. In some countries, 
however, social and legal factors
could limit the take-up of the plan. 
http://bit.ly/RESAVER 


Satellite shuffle for South Pole 


SOUTH POLE | The National Science 
Foundation (NSF) last month 
decommissioned the oldest continuously 
operating satellite in the sky, the 
38-year-old Geostationary Operational 
Environmental Satellite (GOES-3), 
because of its shrinking fuel supply. 
GOES-3 was originally a weather 
satellite, but for the past 21 years it has
provided phone and internet service
to NSF’s Amundsen-Scott South Pole
Station. Most communications satellites
orbit at or near the equator, but GOES-
3’s orbit included a southerly oscillation
that provided about a 6-hour window
of visibility to the South Pole each day.
(Several satellites provide visibility
windows to the South Pole; with GOES-3,
the total window was about 10 to
11 hours each day.) The decommissioning
of GOES-3, in which the satellite was
transferred to a “graveyard orbit” far
from operating geostationary satellites
and then shut down, was completed
on 29 June. “Communication from
Antarctica is hard by today’s standards
when you can pick up an iPhone and
talk to someone in microseconds,”
says Pat Smith, program manager,
technology development for NSF’s U.S.
Antarctic Program. The Defense Satellite
Communications System phase III,
vehicle B7 satellite is replacing GOES-3;
it is more powerful and sophisticated,
providing a higher data rate. It will also
increase the daily visibility window for
the South Pole to 14 to 15 hours,
Smith says.


An artist's rendering of the GOES-3 satellite, which
was moved into “graveyard orbit” last month.


“Poop sticks” reveal
the spread of infectious
disease along the
ancient route.


World’s deepest blue hole 


SANSHA, PARACEL ISLANDS | Off the coast of 
the Paracel Islands in the South China 
Sea lies a dark blue, underwater cavern 
known as the “Dragon Hole.” It’s long 
been the stuff of local legend. Now, a 
team of Chinese researchers says it is 
officially the world’s deepest known 
blue hole, plunging to a depth of about 
300 meters—about 100 meters deeper 
than the previous recordholder, Dean’s
Blue Hole off the coast of the Bahamas. 
Blue holes occur on shallow carbonate 
platforms, such as the Bahama Banks, 
the Yucatan Peninsula, or the Paracel 
Islands; they are thought to have formed 
during past ice ages, when sea level was 
much lower and chemical weathering 
acted on the then-exposed, limestone- 
rich rock. The researchers began their 
exploration of the blue hole last August 
with the aid of an underwater robot, 
sonar scanners, deep-sea current meters, 
and underwater cameras. The team told 
a Chinese news agency last week that 
they had found more than 20 species of 
fish and marine life near the surface of 
the blue hole; below about 100 meters, 
however, the water contains too little 
oxygen to support life, they said. On
24 July, Sansha renamed the cavern the
Sansha Yongle Blue Hole, and said it has 
plans to continue to study it. The Paracel 
Islands are among several disputed ter- 
ritories in the South China Sea; China 
and Vietnam have both laid claim to the 
island chain. 
FINDINGS


New spin on an old ball game 


In cricket, there are many ways that a 
bowler can outwit a batsman, but perhaps 
the one shrouded in the most mystery is 
spin bowling. Delivered at relatively low 
speed, like the knuckleball in baseball,
the ball is given such extreme amounts of
spin that it swerves while in the air and 
changes direction on the bounce. A pair of 
researchers has now shown that subtle spin 
combinations can produce the surprising 
and potentially game-changing results. In a 
paper in Physica Scripta, Ian Robinson of 
Victoria University in Melbourne, Australia, 
and Garry Robinson of the University of 
New South Wales, Canberra, describe a 
mathematical analysis of cricket balls in 
flight under the influence of three different 
directions of spin, plus gravity, drag, and 
lift. Topspin is known to make a ball bounce 
earlier; however, the researchers show that, 
if the total amount of spin is kept constant, a 
small amount of topspin added to a side- 
spinning ball makes it land 25 centimeters 
earlier, and a touch of sidespin added to a
topspin delivery can produce 10 centimeters
of sideways drift. The results, they say,
could help cricket fans understand the
underappreciated art of spin bowling.


Sickness on the Silk Road


Travelers along the Silk Road
toted a lot more than silk and
spices. Fossilized intestinal
parasites found in 2000-year-
old human excrement from
western China are offering the first
evidence of the spread of infectious
diseases along the Silk Road. In 1992,
scientists working at the eastern edge
of the Taklamakan Desert excavated
a latrine at a Silk Road relay station
where travelers likely slept and ate.
The researchers found a number
of “hygiene sticks,” bamboo sticks
with strips of cloth that were used to
wipe the nether regions; the sticks
were sent to a museum and forgot-
ten about for decades. But last year,
researchers at the University of
Cambridge in the United Kingdom
obtained the sticks and examined the
ancient feces. They discovered eggs
from four different parasites, includ-
ing the Chinese liver fluke—a flatworm
endemic to marshy areas. That para-
site, the researchers suggested last
week in the Journal of Archaeological
Science: Reports, must have origi-
nally been picked up by travelers in
modern-day Guangdong province,
some 2000 kilometers away.


Dark matter hunt comes up empty 


The latest, most sensitive search for particles 
of dark matter—the invisible stuff whose 
gravity appears to bind the galaxies—has 
come up empty. The leading candidates
for dark matter are so-called weakly
interacting massive particles, or WIMPs.
Since 2012, physicists working with the
Large Underground Xenon (LUX) detector
have searched for WIMPs bumping into
the atomic nuclei in 370 kilograms of
frigid liquid xenon. But the experiment,
housed 1480 meters deep at the Sanford
Underground Research Facility in Lead,
South Dakota, ended its final run in May,
and researchers found no evidence for
such particles, they reported last week at
a conference in Sheffield, U.K. The WIMP
hunt will continue with newer detectors that
will be even more sensitive: XENON1T, a
detector at Italy’s subterranean Gran Sasso
National Laboratory that contains 3.5 metric
tons of liquid xenon, and a rebuild of the
LUX detector called LZ that would contain
10 metric tons of xenon and come online in
2020. But physicists’ enthusiasm for WIMPs
may be cooling—not just because they
haven’t found them yet, but also because
the world’s biggest atom smasher, Europe’s
Large Hadron Collider, has yet to blast such
particles into existence, as theory suggests
it should.


LUX’s photomultiplier tubes watched in vain.


The United States’s energy efficiency
rank in 2016, up from 13 in 2014,
according to the nonprofit American
Council for an Energy-Efficient
Economy. Germany is in first place.
INFECTIOUS DISEASE


Obstacles loom along path to the end of AIDS 


International meeting highlights clash between ambitious goals and wobbly funding 


By Jon Cohen, in Durban, South Africa 


The ambitious goal of “ending AIDS” is
colliding with sobering realities. Re-
searchers are convinced that if enough
infected people can be put on anti-
retroviral (ARV) drugs, new infection
rates will fall and AIDS will become a
thing of the past. In 2014, the Joint United
Nations Programme on HIV/AIDS (UNAIDS) 
embraced the goal, setting a 2030 target for 
identifying and treating everyone living with 
HIV. That would mean expanding the num- 
ber of people on ARVs from 17 million today 
to nearly 37 million. But at the International 
AIDS Conference here last week, research- 
ers and advocates highlighted disheartening 
trends: Funding is tight, infection rates re- 
main high, drugs aren’t getting cheaper, and 
infected people may not take the pills they 
are offered. “I’m scared,” said UNAIDS head 
Michel Sidibé during the opening ceremony. 
UNAIDS estimates that last year the
world spent $19 billion on the HIV/AIDS 
response. It’s a huge increase from the less 
than $2 billion spent in 2000, when the inter- 
national AIDS meeting last took place here. 
But to “end AIDS,” the agency calculates that
funding must ramp up to $26.2 billion a year 
by 2020. And support is flat or declining. “If 
we continue with this trend, we’ll not be able 
to end AIDS by 2030,” Sidibé said. “We will 
have a rebound in the epidemic.”

A new report from UNAIDS and the Kai- 
ser Family Foundation shows that for the 
first time in 5 years, HIV/AIDS assistance 
from wealthy donor countries declined in 
2015. Poorer countries have also cut domes- 
tic spending on treatment and prevention. 
And the two biggest purchasers of ARVs are 
deeply concerned about their budgets. 

As part of its “replenishment” drive, the 
Global Fund to Fight AIDS, Tuberculosis and 
Malaria is asking donor countries to contrib- 
ute $13 billion to cover grants to countries 
over the next 3 years—up from $12 billion 
raised at the last replenishment. “There’s a 
huge uncertainty,” given the political flux in
the United States, the United Kingdom, and 
other leading European donor countries, says 
Michel Kazatchkine, a former Global Fund 
head who now is a special AIDS envoy for the 
United Nations secretary-general in Geneva, 
Switzerland. The U.S. President’s Emergency 
Plan for AIDS Relief (PEPFAR) in Washing- 
ton, D.C., which invests $4.7 billion a year 
on bilateral aid, has had a flat budget since 
2009 and has been able to expand the num- 
ber of people on treatment primarily because 
drug costs have declined. “Flat is the new 
increase,” said PEPFAR’s director of financial 
stability, Michael Ruffner. 

South Africa feels the funding pinch 
acutely. The country has the world’s larg- 
est epidemic and buys more ARVs than any
other: Out of an estimated 6.6 million people 
infected, 3.4 million are on treatment. It now 
pays for 80% of its HIV/AIDS response, and 
the Global Fund and PEPFAR are cutting 
back support. But the country has had a de- 
clining gross domestic product for the last 
5 years (Science, 1 July, p. 18), along with a 
currency devaluation that made drugs from 
foreign companies more expensive. 
Meanwhile, the number of people around 
the world who become infected with HIV 
each year, nearly 2 million, has remained 
stubbornly stable for 5 years. In Eastern Eu- 
rope and central Asia, the region with the 
fastest growing epidemic in the world, new 
infections jumped 57% between 2000 and 
2015. Russia no longer receives Global Fund 
assistance because it’s too wealthy, and sup- 
port for Ukraine is also dwindling. Many of 
these countries have haphazard HIV/AIDS 
responses. For instance, only 18% of the HIV- 
infected people in Eastern European and cen- 
tral Asian countries currently receive ARVs. 
Cheaper drugs could stretch funds and 
make the 2030 goal more attainable. Indian 
manufacturers of generic pills provide 76% of 
all the drugs used in low- and middle-income 
countries, charging as little as $100 per pa- 
tient per year. But Anil Soni of generic drug 
titan Mylan Pharmaceuticals in Canonsburg, 
Pennsylvania, which has a large operation in 
Bengaluru, India, contends that generic drug
prices for first-line treatment are already at
rock bottom. He has warned the big buyers
that companies will lose money if they cut
prices much lower, and that they need more
stable purchasing commitments to invest
in the plants that will be required to more
than double their production. “I don’t think
it’s impressed upon anyone that treating
30 million people over 100 countries with
15,000 metric tons of drug has never been
done,” said Soni, who in earlier jobs sat on
the other side of the table, negotiating lower
prices when he worked at the Global Fund
and the Clinton Health Access Initiative.

Street marches at the International AIDS Conference
held in Durban, South Africa, last week rallied countries
to offer treatment to all HIV-infected people.

Peter Piot, the former director of UN- 
AIDS who now heads the London School of 
Hygiene & Tropical Medicine, says he spent 
much of his career urging drug companies 
to lower prices, but he now agrees with 
Soni that they’ve hit the bottom. He talks 
of a “looming crisis of insufficient supply 
of ARVs” if generic companies decide that 
the profit margin is too small and that it’s 
better business to produce Viagra or high 
blood pressure drugs. 

UNAIDS’s “ending AIDS” goal is also col- 
liding with human behavior. Its cornerstone 
is what’s known as 90-90-90: Ninety percent 
of HIV-infected people know their status, 
90% of that group receives care, and 90% of 
people on treatment suppress the level of vi- 
rus in their blood to undetectable levels. This 
both helps individuals and reduces the like- 
lihood that they will transmit the infec- 
tion, as a landmark study published in 
2011 proved convincingly. But in the United 
States, only 30% of HIV-infected people fully 
suppress their virus, and the number is far 
lower in many countries. 

A new study of 28,000 people in South 
Africa underscores the problem. The so- 
called Treatment as Prevention (TasP) 
study compared communities that started 
ARVs as soon as people learned they were 
infected with those that began treatment 
later, as per older government recommen- 
dations. The two cohorts had the same 
rate of new infections, probably because 
53% of the people in the immediate treat- 
ment arm who were eligible for the drugs 
never followed up with a clinic visit within 
12 months. Even among the 47% who did, not 
all opted to start treatment. 

The TasP results emphasize that offering 
ARVs to every infected person is just one 
step forward and, absent intensive efforts to 
get people on treatment and help them stick 
with it, has little chance of bringing the epi- 
demic to an end. 
ANIMAL RESEARCH


Chimpanzee sanctuaries 
open door to more research 


Collaboration aims to beef up science at retirement centers 


By David Grimm, in Keithville, Louisiana 


The lab chimp is on the verge of extinc-
tion. Fewer than 700 remain in U.S.
laboratories, and most are expected
to move to sanctuaries over the next
decade because of ethical and scien-
tific concerns (Science, 19 June 2015,
p. 1296). But a new opportunity may be 
opening up for studies of chimpanzee be- 
havior and cognition: A first-of-its-kind 
partnership between a sanctuary and a 
research center, announced this month, 
is designed to bolster the scientific out- 
put of facilities that have until now pri- 
marily focused on the long-term care of 
their animals. 

Proponents hope the agreement, between
the Chimp Haven sanctuary here and a
research arm of the Lincoln Park Zoo in
Chicago, Illinois, will become a model for
others. “I hope that this collaboration is
the first step that sets a trend,” says Gregg
Tully, executive director of the Pan African
Sanctuary Alliance, a network of 22 African
sanctuaries based in Portland, Oregon.

But such partnerships won’t be easy, be-
cause sanctuaries, particularly in the United
States, have long had a fraught relationship
with the research community. “They see
science as a threat,” says Brian Hare, an
evolutionary anthropologist at Duke Uni-
versity in Durham, North Carolina, who
has studied chimpanzee cognition in Af-
rican sanctuaries for more than a decade.
Scientists, meanwhile, have questioned
whether important, quality studies can be
done at facilities that allow little or no inter-
action between researchers and animals.
The agreement between Chimp Haven and
the zoo’s Lester E. Fisher Center for the
Study and Conservation of Apes is “a good
first step,” says Michael Beran, a psycho-
logist at Georgia State University in At-
lanta who has worked with lab chimps
for more than 20 years. “But it’s going to
take a lot more work to see if this model
establishes itself.”


Where the chimps are
Chimpanzees are disappearing


African sanctuaries, which mostly draw
orphans from the bushmeat and pet trades
rather than from labs, have been more
amenable to research, notes Stephen Ross,
director of the Fisher Center, who facili-
tated the new agreement. But projects only
happen when a visiting scientist like Hare
comes along, he says. “When the researcher
goes away, the science goes away.”

He wanted to build a stronger bridge be-
tween the sanctuary and scientific worlds.
In addition to his zoo duties, he chairs the
board of directors of Chimp Haven, which
has more than 200 chimpan-
zees and is the designated
retirement center for all fed-
erally owned chimps. Ross
believed a partnership be-
[...]
limited behavioral research,
and the Fisher Center, where
several Ph.D.s study chimp
memory and tool use, could
bolster Chimp Haven’s re-
[...]
from the Arcus Foundation,
which supports great ape
conservation, launched the collaboration.
It also allowed the group to hire its first 
researcher, postdoc Bethany Hansen. Today, 
she’s jotting notes on her iPad mini as she 
watches seven chimps in “The Courtyard”—
a two-story caged enclosure about the size 
of a tennis court filled with hammocks, tire 
swings, and climbing platforms. She’s track- 
ing each chimp for about 10 minutes, and 
noting every 30 seconds what they’re do- 
ing and who they’re with. The goal is to see 
how the presence of strangers affects their
behavior, findings that may influence how 
much access sanctuaries allow the public. 

Hansen is also studying how the chimps 
adapt to the sanctuary’s environments, in- 
cluding corrals and a 2-hectare forest where 
the animals can climb trees and poke sticks 
into artificial termite mounds. At Chimp 
Haven, she has access to 10 times the num- 
ber of chimpanzees she would at the zoo, 
and, unlike in the wild, there’s no danger 
of losing track of an animal. “You're getting 
the best of both worlds,” Ross says. 

In return, the sanctuary gets a full-time 
scientist. Research director Amy Fultz says 
her team is lucky to collect 6 hours of data 
on the chimpanzees per week. Hansen can 
do 3 hours a day and tackle a variety of proj- 
ects. “It enhances what we can do, and it 
enhances our mission,” Fultz says. 

Some biomedical studies may even be 
possible. Chimp Haven’s president, Cathy 
Spraetz, says the sanctuary would consider 
sharing blood and other tissues collected 
during routine procedures with outside 
scientists. It has also agreed to donate the 
brains of deceased animals. 

Still, sanctuary research faces big chal- 
lenges. Most animals at Chimp Haven have 
spent their entire lives in labs, where some 
were injected with viruses like hepatitis 
and HIV and regularly had organs biop- 
sied and blood drawn. That could compli- 
cate research on how normal chimps think 
and behave, says David Watts, a Yale Uni-
versity primatologist who has studied
chimpanzees in Uganda for more than 
20 years. “We see some puzzling findings 
with these [captive] animals in cognition 
research,” he says, like chimps asking hu- 
mans to do things for them—which would 
never happen in the wild. 


But the biggest hurdles are cultural. 
Ross would like to eventually move on to 
more substantive studies of behavior and 
cognition at the sanctuary. That could in- 
clude giving the animals touchscreens and 
puzzles to play with. Spraetz is open to such 
experiments, as long as they don’t interfere 
with the animals’ normal lives. 

But Molly Polidoroff, who runs Save the 
Chimps—North America’s largest chim- 
panzee sanctuary, based in Fort Pierce, 
Florida—is less comfortable with such work. 
“We don’t test hypotheses with our chimps.” 

Given that resistance, Georgia State’s 
Beran questions whether sanctuaries can do 
weighty science. “These facilities could have 
a very large impact on the field, but only if 
they can find a way to do this work without 
conflicting with their retirement mission,” 
he says. “It would be a real shame if in 
10 years the only research that was being 
done was, ‘What diet should they be on?’”

And if Chimp Haven truly wants to beef 
up its research program, it will need to find 
more money. The National Institutes of 
Health owns most of the chimpanzees here 
and pays for their care, but it doesn’t fund 
research on them. So the collaboration will 
have to expand its reliance on donors and 
private foundations. Ross also hopes that 
scientists who have lost their lab chimps 
will come to sanctuaries to continue their 
work—and bring their own money. 

For now, sanctuaries seem open to the 
idea of conducting more research. Even 
Save the Chimps’s Polidoroff has just hired 
a nationally known scientist to head up a 
formal behavioral research program. “We 
see the potential for doing more,” she says.
Hare is optimistic, too. “I think research on 
great apes will flourish,” he says. “There’s a 
happy page being turned here.” 


Bethany Hansen observes retired chimps in an outdoor play area at Chimp Haven in Keithville, Louisiana.

CONSERVATION BIOLOGY


Rethinking 
the North 
American wolf 


Genome sequencing suggests 
two endangered wolf species 
are coyote hybrids 


By Virginia Morell 


When is a wolf a wolf? For more
than 30 years, the question has
dogged scientists, conservation-
ists, and policymakers attempting
to restore and protect the large
wild canids that once roamed
North America. Now, a study of the com- 
plete genomes of 28 canids reveals that de- 
spite differences in body size and behavior, 
North American gray wolves and coyotes 
are far more closely related than previ- 
ously believed, and only recently split into 
two lineages. Furthermore, the endangered 
red and eastern wolves are not unique lin- 
eages with distinct evolutionary histories, 
but relatively recent hybrids of gray wolves 
and coyotes, the scientists report online this 
week in Science Advances. 

That could be a problem for the wolves. 
The red wolf is currently protected under the 
U.S. Endangered Species Act (ESA), and some 
conservationists would like to see the eastern 
wolf listed as well. (It is protected in Canada.) 
But as hybrids, they may not qualify for pro- 
tection under U.S. law. The study “helps with 
more data but hurts by giving less protection 
to [the] two wolf types,” says Doug Smith, the
leader of Yellowstone National Park’s wolf 
restoration project in Mammoth, Wyoming. 

The research team argues that red and 
eastern wolves should still be protected, 
and urges reconsideration of our black-and- 
white species concept. “People think that 
species should be genetically pure, that there 
should be tidy categories for ‘wolf’ and ‘coy-
ote.’ That’s not what we found,” says Bridgett
vonHoldt, an evolutionary biologist at Prince- 
ton University and the study’s lead author. 
“The study shows that mixed ancestry is 
common, even in animals [in the western 
United States] we’ve traditionally identified 
as ‘pure,’” adds Linda Rutledge, a postdoc in
vonHoldt’s lab who was not involved in the
study and doesn’t accept all its findings. “It 
shows how outdated the endangered species 
policy is with respect to hybrids.”

Gray wolves (Canis lupus) and the smaller,
narrow-snouted coyotes (C. latrans) have
long been accepted as North America’s two
large canid species. But some scientists rec-
ognize two additional wolf species: the red
(C. rufus), found in the southeastern United
States, and the eastern wolf (C. lycaon),
which ranges from the Great Lakes into east-
ern Canada (see map, below).

Opinions vary on wolf ranges and identities, but most researchers agree that the gray
wolf once roamed across much of North America (including into Mexico, not shown)
and that the coyote ranged across the west. A new genetic study finds that the red wolf
and the eastern wolf (one from Quebec in Canada, bottom) arose later by mixing with
coyotes as they expanded eastward.

The United States Fish and Wildlife Ser-
vice (FWS) counts both as
species. It put the red wolf on
the endangered list in 1973
and started a captive breed-
ing program for it in 1980,
but reintroducing the ani-
mals has proven difficult, be-
cause they readily mate with
coyotes. The agency has not
put the eastern wolf on the
endangered list, although it is
restricted to a small portion
of its former range. (In a con-
troversial move in 2012, FWS
used the existence and range
of the eastern wolf as a tech-
nicality that could invalidate
the gray wolf’s protections,
because if the eastern wolf is
a real species, then the gray’s
[...]
“species” are, in fact, wolf-
coyote hybrids that arose after
the grays were hunted almost
to extinction. To help sort out
the North American wolves’
muddled history, vonHoldt’s
team sequenced the whole
genomes (nearly 3 billion
bases each) of 28 large canids;
they included wolves from
Asia, Mexico, Canada, and
the United States, plus coy-
otes, domesticated dogs, and
a golden jackal. Comparing
the genomes let them “look
back in time at the canids’
deep evolutionary history,”
vonHoldt explains, “and to
find each species’s closest rela-
tive, and when they diverged.”

Using a molecular clock based on differ-
ences in the genomes to calculate when coy-
otes and gray wolves split, the team got a
surprise: These canids separated from the
Eurasian wolf and into two distinct lineages
between 6000 and 117,000 years ago. Other
researchers had previously dated this event
to 1 million years ago using the fossil record.
The recent date for the wolf-coyote split “is
phenomenal,” vonHoldt says. “They are
very close relatives.” Even western wolves
that do not breed with coyotes still share
some coyote genes.

But the team found even more coyote
genes, of more recent origin, in red wolves
and eastern wolves, including those from
Algonquin Provincial Park in Canada where
pure eastern wolves were thought to exist.
The paper estimates that Algonquin wolves
have about 32% coyote ancestry, and Que-
bec wolves more than 50%. The team con-
cludes that neither the red nor the eastern
wolf is a species. Instead, they suggest that
both are hybrid populations that arose after
Europeans arrived in North America, when
gray wolves that managed to survive hunt-
ing and habitat loss mixed with expanding
populations of coyotes. “There’s nothing in
their genome that’s not gray wolf or coyote,”
says co-author Robert Wayne, an evolution-
ary biologist at the University of California,
Los Angeles.

Rutledge and others, including conserva-
tion geneticist Paul Wilson, who studies the
eastern wolf at Trent University in Peterbor-
ough, Canada, argue that researchers need to
sequence more samples of C. lycaon before
dumping that taxon. But others who have
long questioned the status of eastern and red
wolves welcome the work. “Wolf biologists
and others have been wait-
ing for this sort of definitive
analysis for years,” says Susan
Haig, a wildlife ecologist at
the United States Geological
Survey in Corvallis, Oregon.

The loss of species status
for the red and eastern wolves
doesn’t mean they should lose
protection, Wayne and others
say. Hybridization is “a natu-
ral and commonly occurring
evolutionary event,” Wayne
says, noting that the ESA has
successfully been used to pro-
tect hybrid species such as the
Florida puma and western
spotted owl. He thinks eastern
and red wolves should be pro-
tected because they likely still
[...]
to today’s human-dominated
landscapes. The team also ar-
gues that the agency’s argu-
ments for delisting the gray
wolf are no longer valid.

It’s possible that eastern
and red wolves, if regarded
as grays, would still be pro-
tected, but FWS declined to
comment on the details of
the paper.

Other scientists say the
messy natural biology re-
vealed by the study clashes
with society’s need for clear le-
gal definitions. “It’s beautiful
work and topflight science,”
says Mike Phillips, a restora-
tion ecologist with the Turner
Endangered Species Fund in
Bozeman, Montana. “But from a practical
standpoint, to do what they’re asking [and
consider the ecological benefits of hybrids],
you'd have to amend the ESA.”

He and others lament the possibility
that red wolves might lose ESA protection
because of the findings. That, they say,
would be a sad irony for canids that likely
evolved because of human disturbance in
the first place.

After failed coup, Turkey's
academics feel regime's wrath


Thousands of researchers and university officials fired, 
suspended, or told to return home immediately 


By John Bohannon 


Last week’s failed coup attempt in Turkey
lasted just a day, but the pain has only be-
gun for Turkish academics. In a massive
political purge, the government has sus-
pended or fired thousands of professors
and staff at universities, as well as em-
ployees of the nation’s ministry of education.
Officials have also ordered researchers who 
are affiliated with Turkish universities and 
working abroad to return home immediately,
with an implied threat of treason charges for 
those who don’t. 

As a result of the moves, even Turkey's re- 
searchers who are still employed “have just 
lost this entire field and conference season,”
says Cagan Sekercioglu, an ecologist based at 
the University of Utah in Salt Lake City who, 
like other Turks who permanently left their 
country for academia abroad, is beyond the 
government’s reach. 

The 15 July coup attempt was short-lived. 
Rebel soldiers briefly seized key buildings, 
bridges, and roads, but Turkish President 
Recep Tayyip Erdogan quickly reestablished 
his grip on power. Erdogan then made an 
ominous announcement about his oppo- 
nents: “They will pay a heavy price.” Exactly 
who “they” are remains to be seen. 

Many of the protest movements against 
Erdogan over the years have originated 
among academics, which may be why they 
are one focus of his wrath. In the span of a
few days, Erdogan fired more than 45,000
civil servants in the military and judiciary, 
and 15,000 staff members of the ministry of 
education. Some 21,000 teachers lost their li- 
censes, and more than 1500 university deans 
have reportedly resigned under pressure. 
Since then, officials have sacked hundreds 
more faculty and staff at some universities. 
It’s not clear whether or when those removed 
from jobs might get them back. Universities 
have ground to a halt, say sources based in 
Turkey. On 21 July, Erdogan tightened his 
grip by declaring a 3-month state of emer- 
gency, which allows him to set curfews, issue 
decrees, and make arrests without warrants. 
The global scientific community is looking 
on with dismay. “[We are] alarmed by the re- 
pressive and excessive nature of recent mea- 
sures against several public sectors in Turkey, 
including the academic and research commu-
nity,” reads an open letter published last week
by the European Federation of Academies of 
Sciences and Humanities. The purge is not a 
proportional reaction, says Martin Chalfie, 
chair of the Committee on Human Rights at 
the U.S. National Academies of Sciences, En- 
gineering, and Medicine in Washington, D.C. 
“Protecting national security should not be 
incompatible with safeguarding fundamental 
rule of law and human rights principles.” 

Turkish law does give Erdogan broad
authority over Turkey's 180 universities.
Although each campus chooses its rector
through a nominally democratic process in
which faculty members vote for candidates,
the ministry of education has the final say.
“The university rarely gets its top choice,”
says Caghan Kizil, a Turkish molecular bio-
logist based at the Dresden University of Tech-
nology in Germany. As a result, Erdogan’s al-
lies occupy key slots, as the aftermath of the
coup has made clear. “The university rectors
asked their deans to resign and the implica-
tion was clear: Resign or you will be accused
of treason and arrested,” Kizil says.

Supporters of Turkey’s government celebrate
the failed coup on a bridge in Istanbul.

For some Turkish academics, the govern- 
ment’s response has mostly spelled in- 
convenience, at least so far. Graduate stu- 
dents have had to at least temporarily aban- 
don visiting research posts that they had won 
through prestigious programs, such as the 
European Union’s Erasmus Mundus schol- 
arship. Postdoctoral researchers planning to 
travel abroad are in limbo. “A Turkish post- 
doc who was going to come to my lab had 
to cancel and lost a month’s rent,” Utah's
Sekercioglu says. Turkish universities have 
been contacting their researchers abroad, but 
some are still waiting for any word. “Nobody 
has called me back yet,” says a Turkish scien- 
tist on sabbatical at a U.S. university who re- 
quested anonymity. The purge troubles him, 
he says, but he is “very glad that the coup did 
not succeed.” 

Many academics fear the clampdown is just 
the beginning. It is a sign that “academic free- 
doms will no longer exist” in Turkey, predicts 
Sinem Arslan, a Turk doing doctoral work in 
political science at the University of Essex in 
the United Kingdom. The government has 
urged rectors to hand over the names of uni- 
versity members suspected of having ties to 
coup organizers, academics report; many fear 
the move will encourage ideological blacklist- 
ing. Government officials “want to take the 
universities under their full control,” Arslan
says. “I don’t think that anybody will be able 
to work on research areas that are considered 
taboo by the government or write anything 
that criticizes the government.” Erdogan has 
been hostile to the women’s rights movement, 
for example, and to academics who express 
support for the country’s Kurdish minority 
(Science, 25 March, p. 1381). 

Many see the purge as an opportunistic 
power grab by the ruling Justice and De- 
velopment Party, and maintain that plans 
to quash dissent have been in the works for 
years. Critics of the government had hoped 
to find embarrassing evidence of those 
plans in nearly 300,000 internal govern- 
ment emails released by the organization 
WikiLeaks days after the coup. But the 
cache, which includes emails dating back 
to 2010, has so far yielded little.

EUROPE


Uncertainty reigns in Brexit Britain 


As U.K. universities counsel their E.U. staff, researchers see first signs of fallout 


By Erik Stokstad, in Manchester, U.K. 


Nothing has changed, yet everything
has changed. In the weeks since the
United Kingdom voted to leave the
European Union, the U.K. government
has made it clear that foreign resi-
dents won't be kicked out, and their
legal rights remain intact—and that might 
only change after new treaties are forged, a 
process that could take years. But when the 
University of Oxford held an information ses- 
sion on Brexit this month, the meeting was 
packed and several hundred scientists and 
other staff crammed into an overflow room. 
“The general feeling was anxiety,” says Ian
Walmsley, pro-vice-chancellor for research 
and innovation at Oxford. 

Researchers from other parts of the Euro- 
pean Union, who make up 16% of academic 
staff at U.K. universities, are anxious about 
their status, and—like their U.K. colleagues— 
concerned about access to European grant 
funding and research facilities. Reassuring 
details are scarce. In a speech at the Euro- 
Science Open Forum (ESOF), the largest 
general scientific meeting in Europe, held 
here from 23 to 27 July, a tired looking Jo 
Johnson, U.K. minister for universities and 
science, could only tell delegates: “I recognize 
the demand for further clarity on these issues 
and I’m working intensively with colleagues 
across government to provide it as soon as 
practicable.” He took no questions. 

After the 23 June referendum, the pres- 
sure group Scientists for EU asked re- 
searchers about their concerns. Among the 
400 responses, 46 reported hearing about 
or experiencing xenophobia. And at least 
84 people were planning to leave the United 
Kingdom or know someone who is. “We are 
very concerned about recruitment and reten-
tion of academic and research staff,” says Paul
Crowther, an astrophysicist at the University 
of Sheffield in the United Kingdom. 

The U.K. government announced on 
11 July that the referendum did not change
the rights of E.U. nationals. Nevertheless, the 
message to foreign scientists must be stron- 
ger, says Venki Ramakrishnan, president of 
the Royal Society in London. “They need to 
be reassured, if they are here and employed, 
they’re not suddenly going to be told that 
they have to apply for permission or leave.” 

Larger political developments have added 
to the anxiety. The new prime minister, 
Theresa May, did little to comfort university-
based scientists when she formed her new
cabinet earlier this month: Responsibility 
for universities was split from research and 
placed in a different department. Johnson, 
who was appointed minister for universities 
and science in May 2015 and remains in the 
role, now has to report to two departments. 
Universities are starting to take matters 
into their own hands. At Imperial College 
London (ICL), where 25% of staff and 20% of 
students hail from other countries in the Eu- 
ropean Union, the human resources depart- 
ment has set up 10 sessions from now until 
September with a law firm to explain how 
to apply for residency and citizenship. The 
university has also offered interest-free loans 
to help cover the cost of applications. “The 
biggest risk to our science right now is un-
certainty and misperception about studying
and working in the U.K.,” Maggie Dallman,
associate provost of ICL, said in a statement.
“We must keep shouting that U.K. science is
open for business and help the government
ensure that this remains the case.”

As for participation in Horizon 2020, the
European Union’s giant, 7-year funding pro-
gram, the E.U. directorate for research said
on 4 July that U.K. scientists can still ap-
ply for and participate in grants while the
United Kingdom remains part of the Euro-
pean Union. “No one should have any doubt
about it,” Carlos Moedas, the commissioner
for research, told a large audience at ESOF.
“Horizon 2020 projects will continue to be
evaluated based on merit and not on nation-
ality.” There is anecdotal evidence, however,
that some E.U. researchers and managers
perceive that U.K. involvement might now
pose a liability in an E.U. grant application.
Among the Scientists for EU responses,
33 described disruptions in Horizon 2020
consortia, such as being asked not to partici-
pate. “It’s a chilling effect on collaboration,”
says Kieron Flanagan, a science policy expert
at the University of Manchester.

Some say the government should take
steps to forestall discrimination against U.K.
participants in Horizon 2020 grants. Before
and during negotiations over Brexit, for ex-
ample, it could guarantee to make up for any
funding lost because of the separation. That
way E.U. applicants would be assured that
U.K. collaborators can bring money for the
duration of a project. “We will continue to
push for that,” Ramakrishnan says.

After the separation, the United Kingdom
could buy into Horizon 2020 as an associate
member, as some other non-E.U. nations have
done. But some are skeptical that the govern-
ment will replace much, if any, of the 10% of
science funding that currently comes from 
the European Union. “I doubt if we will per- 
suade the government to increase the money 
to make up for E.U. funds,” Walmsley says. 

The scientific community is gearing up 
to push for the best possible deal once the 
divorcing partners begin discussing terms. 
And Ramakrishnan says he’s optimistic 
about future participation in E.U. science 
programs. “As one of the strongest science 
countries in Europe, we're also valuable to 
the rest of Europe,” he says. Another rea- 
son for hope, according to James Wilsdon, 
a science policy expert at the University of 
Sheffield, is that in the difficult Brexit ne- 
gotiations, participants will seek easy areas 
of agreement first—and science could be 
recognized as a mutual win. 


With reporting by Tania Rabesandratana. 


29 JULY 2016 + VOL 353 ISSUE 6298 437 




FEATURES 


FORBIDDEN PLANETS


How the zoo of exoplanets 


has turned planet formation theory upside down 


When astronomers discovered
the first exoplanet around a 
normal star 2 decades ago, 
there was joy—and bewilder- 
ment. The planet, 51 Peg- 
asi b, was half as massive as 
Jupiter, but its 4-day orbit 
was impossibly close to the 
star, far smaller than the 
88-day orbit of Mercury. Theorists who 
study planet formation could see no way 
for a planet that big to grow in such tight 
confines around a newborn star. It could 
have been a freak, but soon, more “hot 
Jupiters” turned up in planet searches, and 
they were joined by other oddities: planets
in elongated and highly tilted orbits, even
planets orbiting their stars “backward,”
counter to the star’s rotation.
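
To get a feel for how tight 51 Pegasi b's 4-day orbit is next to Mercury's 88-day orbit, Kepler's third law converts an orbital period into an orbital distance. The sketch below is only illustrative: it assumes a star of exactly one solar mass (51 Pegasi is only roughly sunlike) and treats the period as the sole input, and the function name is made up for this example.

```python
def semi_major_axis_au(period_days: float, star_mass_suns: float = 1.0) -> float:
    """Kepler's third law in solar units: a^3 = M * P^2 (a in AU, P in years).

    The one-solar-mass default is an assumption, not a measured value.
    """
    period_years = period_days / 365.25
    return (star_mass_suns * period_years ** 2) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Periods taken from the text: 4 days for 51 Pegasi b, 88 days for Mercury.
    for name, days in [("51 Pegasi b", 4.0), ("Mercury", 88.0)]:
        print(f"{name}: about {semi_major_axis_au(days):.3f} AU from its star")
```

Under these assumptions the hot Jupiter sits at roughly 0.05 astronomical units, several times closer to its star than Mercury is to the sun.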

The planet hunt accelerated with the 
launch of NASA’s Kepler spacecraft in 
2009, and the 2500 worlds it has discov- 
ered added statistical heft to the study 
of exoplanets—and yet more confusion. 
Kepler found that the most common type 
of planet in the galaxy is something be- 
tween the size of Earth and Neptune—a 
“super-Earth,” which has no parallel in our 
solar system and was thought to be almost 
impossible to make. Now, ground-based 
telescopes are gathering light directly from 
exoplanets, rather than detecting their
presence indirectly as Kepler does, and
they, too, are turning up anomalies. They 
have found giant planets several times the 
mass of Jupiter, orbiting their star at more 
than twice the distance Neptune is from 
the sun—another region where theorists 
thought it was impossible to grow large 
planets. Other planetary systems looked 
nothing like our orderly solar system, chal- 
lenging the well-worn theories that had 
been developed to explain it. 

“It’s been really obvious things didn’t
fit pretty much from day one,” says Bruce
Macintosh, a physicist at Stanford Uni-
versity in Palo Alto, California. “There has
never been a moment when theory has
caught up with observations.”

The exoplanet HR 8799 b, a super-Jupiter (seen from a
speculative moon), takes 460 years to orbit its star.

Theorists are trying to catch up— 
coming up with scenarios for growing 
previously forbidden kinds of planets, in 
places once thought off-limits. They are en- 
visioning how planets could form in much 
more mobile and chaotic environments 
than they ever pictured before, where na- 
scent planets drift from wide to narrow 
orbits or get ricocheted into elongated or 
off-kilter paths by other planets or passing 
stars. But the ever-expanding zoo of exotic 
planets that observers are tallying means 
every new model is provisional. “You can 
discover something new every day,” says 
astrophysicist Thomas Henning of the Max 
Planck Institute for Astronomy in Heidel- 
berg, Germany. “It’s a Gold Rush situation.” 


THE TRADITIONAL MODEL of how stars 
and their planets form dates back to the 
18th century, when scientists proposed that 
a slowly rotating cloud of dust and gas 
could collapse under its own gravity. Most 
of the material forms a ball that ignites 
into a star when its core gets dense and hot 
enough. Gravity and angular momentum 
herd the leftover material around the proto- 
star into a flat disk. Dust is key to trans- 
forming this disk into a set of planets. The 
dust, which accounts for a small fraction 
of the disk’s mass, is made up of micro-
scopic specks of iron and other solids. As
they swirl in the roiling disk, the specks
occasionally collide and stick together by
electromagnetic forces. Over a few million
years, the dust builds up into grains, peb-
bles, boulders, and, eventually, kilometer-
wide planetesimals.

Super-Earths rising

Hot Jupiters came first. The Kepler spacecraft then found super-Earths, also in close-in orbits, to be the most
common type of exoplanet. Ground-based telescopes are now directly seeing distant giants like HR 8799 b.

At that point gravity takes over, pulling 
in other planetesimals and vacuuming up 
dust and gas until planet-sized bodies take 
shape. By the time that happens in the in- 
ner part of the disk, most of its gas has 
been stripped away, either gobbled up by 
the star or blown away by its stellar wind. 
The dearth of gas means inner planets re- 
main largely rocky, with thin atmospheres. 

This growth process, known as core 
accretion, proceeds faster in the outer 
parts of the disk, where it is cold enough 
for water to freeze. The ice beyond this 
“snowline” supplements the dust, allowing 
protoplanets to consolidate more quickly. 
They build up a solid core five to 10 times 
the mass of Earth—quickly enough that the 
disk remains gas-rich and the core can pull 
in a thick atmosphere, producing a gas gi- 
ant like Jupiter. (One of the goals of NASA’s 
Juno spacecraft, which arrived at Jupiter 
earlier this month, is to see whether the 
planet really does have a massive core.) 

This scenario naturally produces a planet-
ary system just like our own: small, rocky
planets with thin atmospheres close to the
star, a Jupiter-like gas giant just beyond 
the snowline, and the other giants getting 
progressively smaller at greater distances 
because they move more slowly through 
their orbits and take longer to hoover up 
material. All the planets remain roughly 
where they formed, in circular orbits in the 
same plane. Nice and tidy. 

But the discovery of hot Jupiters sug- 
gested something was seriously amiss with 
the theory. A planet with an orbit mea- 
sured in days travels an extremely short 
distance around the star, which limits the 
amount of material it can scoop up as it 
forms. It seemed inconceivable that a gas 
giant could have formed in such a location. 
The inevitable conclusion was that it must 
have formed farther out and moved in. 

Theorists have come up with two possi- 
ble mechanisms for shuffling the planetary 
deck. The first, known as migration, re- 
quires there to be plenty of material left in 
the disk after the giant planet has formed. 
The planet’s gravity distorts the disk, cre- 
ating areas of higher density, which, in 
turn, exert a gravitational “drag” on the 
planet, causing it to gradually drift inward 
toward the star. 

The new normal

Astronomers thought they knew how planets form by studying one system:
our own (this page). But in the past 20 years, they have discovered seemingly
impossible exoplanets (opposite page), turning theories on their heads.

Ice giants: Move slowly and form slowly, ending up smaller.

There is supporting evidence for the
idea. Neighboring planets often end up in
a stable, gravitational relationship known
as orbital resonance. This happens when
the lengths of their orbits are in a ratio of
small whole numbers. Pluto, for example,
orbits the sun two times for every three
orbits of Neptune. It’s highly unlikely that
they just happened to form that way, so they
must have drifted into that position, where
they were locked in by the extra stability.
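
The "ratio of small whole numbers" can be checked with a few lines of arithmetic. The sketch below is a rough illustration, not part of any study cited here: the orbital periods are approximate textbook values supplied as assumptions, and the helper name is hypothetical.

```python
from fractions import Fraction


def nearest_resonance(period_outer: float, period_inner: float,
                      max_denominator: int = 5) -> Fraction:
    """Closest fraction to period_outer / period_inner whose denominator
    does not exceed max_denominator."""
    return Fraction(period_outer / period_inner).limit_denominator(max_denominator)


# Approximate orbital periods in Earth years (assumed values, not from the article).
PLUTO_YEARS = 248.0
NEPTUNE_YEARS = 164.8

ratio = nearest_resonance(PLUTO_YEARS, NEPTUNE_YEARS)
print(f"Pluto/Neptune period ratio is close to {ratio.numerator}:{ratio.denominator}")
```

With these inputs the period ratio lands within about half a percent of 3:2, which is the same statement as Pluto completing two orbits for every three of Neptune.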
Migration early in our solar system’s his- 
tory could account for other oddities, includ- 
ing the small size of Mars and the sparse, 
disrupted asteroid belt. To explain them, 
theorists have invoked a maneuver called 
the grand tack, in which Jupiter originally 
formed closer to the sun, drifted inward al- 
most to the orbit of Earth, and then drifted 
out again to its current position. 

Some modelers find such scenarios un- 
necessarily complex. “I do have faith in 
Occam’s razor,’ says Greg Laughlin, an as- 
tronomer at the University of California 
(UC), Santa Cruz. Laughlin argues that 
planets are more likely to form in place 
and stay put. He says it’s possible for large 
planets to form close to their star if proto- 
planetary disks contain much more material 
there than previously believed. Some move- 
ment of planets may still occur—enough to 
explain resonances, for example—but “it’s 
a final subtle adjustment, not a major con- 
veyor belt,’ Laughlin says. 

But others say that there simply could not 
be enough material to form close-in plan- 
ets like 51 Pegasi b and others that are even 
closer. “They cannot have formed in situ,” 
physicist Joshua Winn of the Massachusetts 
Institute of Technology in Cambridge de- 
clares flatly. And the sizable fraction of exo- 
planets that appear to be in elongated, tilted, 
or even backward orbits also seems to imply 
some kind of planet shuffling. 

For these oddballs, theorists invoke a grav- 
itational melee rather than a sedate migra- 
tion. A mass-rich disk could produce many 
planets close together, where gravitational 
tussles would fling them into the star, into
weird orbits, or out of the system. Another
potential disruptor is a companion star in an
elongated orbit. Most of the time it would be
too far away to have an influence, but occasionally
it could swing in and stir things up.
Or, if the parent star is a member of a tight-knit
stellar cluster, a neighboring star might
drift too close and wreak havoc. “There are
a lot of ways to break a system,” Winn says.

Gas giants

Form a solid core
quickly, pulling in a
thick gas envelope.


KEPLER’S SURPRISING FINDING that 60% of 
sunlike stars are orbited by a super-Earth, 
however, requires a whole new class of 
theories. Most super-Earths, thought to 
be largely solid rock 
and metal with modest 
amounts of gas, follow 
tighter orbits than Earth, 
and often a star has sev- 
eral. The Kepler-80 sys- 
tem, for example, has four 
super-Earths, all with or- 
bits of 9 days or less. The 
traditional theory holds 
that inside the snowline 
core accretion is too slow 
to produce something so 
large. And super-Earths
are rarely found in reso- 
nant orbits, suggesting that they haven’t mi- 
grated, but formed where they sit. 
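
To give a sense of how tight a 9-day orbit is, Kepler's third law can convert the period into an orbital distance. The sketch below assumes, purely for illustration, a star of one solar mass; the real host star's mass is not given in the text, and the function name is hypothetical.

```python
def semi_major_axis_au(period_days, stellar_mass_solar=1.0):
    """Kepler's third law in solar units: a^3 = M * P^2 (a in AU, P in years)."""
    period_years = period_days / 365.25
    return (stellar_mass_solar * period_years ** 2) ** (1.0 / 3.0)

# A 9-day orbit around a star assumed to have one solar mass
a_nine_days = semi_major_axis_au(9.0)
print(f"9-day orbit: about {a_nine_days:.3f} AU")

# Earth's 365.25-day orbit recovers 1 AU under the same assumption
print(f"Earth check: {semi_major_axis_au(365.25):.3f} AU")
```

Under this assumption a 9-day orbit sits at less than a tenth of Earth's distance from the sun, far inside Mercury's orbit at about 0.39 AU.
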
Researchers are coming up with ways 
around the problem. One idea is to speed 
up accretion, through a process known as 
pebble accretion. The gas in a rich disk ex- 
erts a lot of drag on pebble-sized objects. 
This generally slows them down, causing 
them to drift in toward the star. If they pass 
a planetesimal along the way, their slow 
speed means they can be captured more 
easily, boosting accretion. But faster ac- 
cretion and a gas-rich disk raise their own 
problem: The super-Earths ought to pull 
in a thick atmosphere once they exceed a 
certain size. “How do you keep them from
becoming gas giants?” asks astrophysicist
Roman Rafikov of the Institute for Advanced
Study in Princeton, New Jersey.

An image of HL Tauri’s protoplanetary
disk. Are planets forming in the gaps?

Snowline

Outside this boundary,
water ice helps solid cores
form quickly.

Eugene Chiang, an astronomer at UC 
Berkeley, says there is no need to speed up 
accretion, so long as the disk is solid-rich and 
gas-poor. He says that an inner disk 10 times 
denser than the one that formed the solar 
system could easily produce one or more 
super-Earths. Chiang has his super-Earths 
avoid collecting too much residual gas by 
forming in the dying days of the disk when 
most of the gas has dissipated. 

Some early observations from the Ata- 
cama Large Millimeter/ 
submillimeter Array 
(ALMA), an international 
facility nearing comple- 
tion in northern Chile, 
support this proposal. 
ALMA can map radio 
emissions from the warm 
dust and gravel in disks. 
The few it has studied so 
far seem to be relatively 
massive. But the observa- 
tions aren’t yet a smoking 
gun, because ALMA is not 
yet fully operational and 
it can only see the outer parts of disks, not 
the regions where super-Earths reside. “Getting
close in, that’s the trick,” Chiang says,
something that ALMA may manage when all
66 of its antennas are working.

Chiang also has an explanation for another 
discovery of Kepler’s: superpuffs, a rare and 
equally problematic set of planets that have 
a smaller mass than super-Earths but appear 
huge, with a puffed-up atmosphere mak- 
ing up 20% of their mass. Such planets are 
thought to form in a gas-rich disk. But in the 
inner disk, warm gas would fight against the 
planet’s weak gravity, so the cold and dense 
gas of the outer disk is the more likely womb. 
Chiang invokes migration to explain their
close orbits, a notion supported by the fact
that superpuffs are often found locked in
resonant orbits.

Weird orbits

A sign that something disrupted
the system in the past.

Superpuff

A small core with
an infeasibly
big atmosphere.


MOST OF THE ATTENTION in exoplanet re- 
search has so far focused on the inner parts 
of planetary systems, roughly within a dis- 
tance equivalent to the orbit of Jupiter, for 
the simple reason that that’s all existing 
detection methods can see. The two main 
methods—measuring the wobble of stars 
caused by the gravitational tug of an orbiting 
planet and measuring the periodic dimming 
of a star as a planet passes in front—both 
favor big planets in close orbits. Imaging the 
planets themselves is extremely difficult, be- 
cause their faint light is all but swamped by 
the glare from their star, which can be a bil- 
lion times brighter. 
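
The bias of the dimming method toward big planets follows from simple geometry: the fractional drop in starlight is roughly the ratio of the planet's disk area to the star's. The sketch below uses that standard approximation (it ignores limb darkening and grazing transits), with rounded radii supplied here as assumptions rather than values from the article.

```python
def transit_depth(planet_radius_km, star_radius_km):
    """Fractional dimming when a planet crosses its star: (Rp / Rs)^2."""
    return (planet_radius_km / star_radius_km) ** 2

# Rounded radii in kilometers (assumed values for illustration)
SUN_RADIUS_KM = 695_700
JUPITER_RADIUS_KM = 69_911
EARTH_RADIUS_KM = 6_371

jupiter_depth = transit_depth(JUPITER_RADIUS_KM, SUN_RADIUS_KM)
earth_depth = transit_depth(EARTH_RADIUS_KM, SUN_RADIUS_KM)

print(f"Jupiter-size planet: {jupiter_depth:.2%} dimming")
print(f"Earth-size planet:   {earth_depth:.4%} dimming")
```

With these numbers a Jupiter-size planet dims a sunlike star by roughly 1%, while an Earth-size planet dims it by less than 0.01%, which is why small planets are so much harder to catch.
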

But by stretching the limits of the world’s 
biggest telescopes, astronomers have seen 
a handful of planets directly. And over the 
past couple years, two new instruments de- 
signed specifically to image exoplanets have 
joined the hunt (Science, 21 February 2014, 
p. 833). Europe’s Spectro-Polarimetric High- 
contrast Exoplanet REsearch (SPHERE) and 
the U.S.-backed Gemini Planet Imager (GPI) 
are attached to big telescopes in Chile and 
employ sophisticated masks, called corona- 
graphs, to block out the light of the star. Not 
surprisingly, planets far from their stars are 
the easiest targets. 

One of the earliest and most astounding 
systems found by direct imaging is the one 
around the star HR 8799, where four plan- 
ets range in orbits from beyond that of Sat- 
urn out to more than twice the distance of 
Neptune. What’s most surprising is that all 
four are huge, more than five times the mass 
of Jupiter. According to theory, planets in 
such distant orbits move so slowly that they 
should grow at a glacial rate and top out at 
masses well short of Jupiter’s before the disk 
disperses. Yet the planets’ nice circular orbits
suggest they weren’t flung there from closer
to their stars.

Super-Earth

A rocky leviathan,
but how did it avoid
becoming a gas giant?

Such distant giants lend support to the 
most radical challenge to standard theory, 
in which some planets form not by core ac- 
cretion, but by a process called gravitational 
instability. This process requires a gas-rich 
protoplanetary disk, which breaks up into 
clumps under its own gravity. These blobs 
of gas would collapse over time directly into 
giant planets without having to form a solid 
core first. Models suggest that the mecha- 
nism will only work in particular circum- 
stances: The gas has to be cold, it mustn’t 
be spinning too fast, and the contracting 
gas must be able to shed heat efficiently. 
Can it explain the planets of HR 8799? Only 
the outer two are distant and cold enough, 
Rafikov says. “It’s still quite a puzzling sys- 
tem,” he says. 

In the past, radio telescope observations 
of protoplanetary disks have provided some 
support for gravitational instability. Sensi- 
tive to cold gas, the telescopes saw disks spat- 
tered with messy, asymmetrical blobs. But 
recent images from ALMA paint a different 
picture. ALMA is sensitive to shorter wave- 
lengths that come from dust grains in the 
midplane of the disk, and its images of the 
star HL Tauri in 2014 and TW Hydrae this 
year showed smooth, symmetrical disks with 
dark circular “gaps” extending far beyond 
Neptune-like orbits (see picture, p. 440). “It 
was a tremendous surprise. The disk was 
not a mess, but has a nice, regular, beautiful 
structure,” Rafikov says. These images, suggestive
of planets sweeping their orbits clean
as they grow by core accretion, were a blow 
to advocates of gravitational instability. 

It’s too early to tell what other surprises 
GPI and SPHERE may find in the outer 
reaches of planetary systems. But the region 
between those outlying neighborhoods and 
the close-in domains of hot Jupiters and 
super-Earths remains stubbornly out of
reach: too close to the star for direct imaging,
too far for indirect techniques relying on
stellar wobbles or dimming. As a result, it is
hard for theorists to get a full picture of what
exoplanetary systems are like. “We’re basing
things on fragmentary and incomplete observations,”
Laughlin says. “Right now, everyone’s
probably wrong.”

Telescopes cannot yet
identify planets in
middle-distance orbits.

Distant super-Jupiters

Did they form a solid core
first, or condense straight
from the gas disk?

Astronomers won't have to wait long for 
better data. Next year, NASA will launch 
its Transiting Exoplanet Survey Satellite 
(TESS), and the following year the European 
Space Agency (ESA) is expected to launch 
the Characterizing Exoplanets Satellite 
(CHEOPS). Unlike Kepler, which surveyed a 
large number of stars in sparse detail to com- 
pile an exoplanetary census, TESS and CHE- 
OPS will focus on bright, sunlike stars close 
to Earth, enabling researchers to explore the 
midorbit terra incognita. And because the 
targeted stars are nearby, ground-based tele- 
scopes should be able to assess the mass of 
their planets, allowing researchers to calcu- 
late the planets’ density, indicating which are 
rocky or gassy. 
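
The density calculation itself is short once a mass and a radius are in hand (the radius would typically come from the depth of the transit, an inference not spelled out in the text). The sketch below assumes a spherical planet; the Earth values are rounded reference figures supplied here, and the second planet is entirely hypothetical.

```python
import math

def bulk_density(mass_kg, radius_m):
    """Mean density of a spherical planet in kg per cubic meter."""
    volume = (4.0 / 3.0) * math.pi * radius_m ** 3
    return mass_kg / volume

# Earth as a sanity check (rounded reference values, assumed here)
EARTH_MASS_KG = 5.972e24
EARTH_RADIUS_M = 6.371e6

earth_density = bulk_density(EARTH_MASS_KG, EARTH_RADIUS_M)
print(f"Earth: {earth_density:.0f} kg/m^3")

# A hypothetical planet with 5 Earth masses and 2.5 Earth radii
puffy = bulk_density(5 * EARTH_MASS_KG, 2.5 * EARTH_RADIUS_M)
print(f"Hypothetical planet: {puffy:.0f} kg/m^3")
```

A density near Earth's points to rock and metal, whereas a much lower value, as for the hypothetical planet here, would suggest a substantial gas envelope.
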

The James Webb Space Telescope, due 
for launch in 2018 (Science, 19 February, 
p. 804), will go further, analyzing starlight 
that passes through an exoplanet’s atmosphere
to determine its makeup. “Composition
is an important clue to formation,”
Macintosh says. For example, finding heavier 
elements in the atmospheres of super-Earths 
could suggest that a disk rich in such ele- 
ments is needed to form planetary cores fast 
enough. And next decade, spacecraft such as 
NASA’s Wide Field Infrared Survey Telescope
and ESA’s Planetary Transits and Oscillations 
will join the hunt, alongside a new genera- 
tion of enormous ground-based telescopes 
with mirrors 30 meters across or more. 

If the past is anything to go by, modelers 
will have to keep on their toes. “Nature is 
smarter than our theories,” Rafikov says. 
Parasitic plants—A CuRe for what ails thee


A parasitic plant is perceived by its host plant in a similar manner to microbes 


By Vardis Ntoukakis and
Selena Gimenez-Ibanez


Parasitic plants cause dramatic changes
in ecosystems and represent a seri- 
ous risk to agriculture by attacking 
crops of high economic importance. 
A highly conserved part of plant 
immune systems is the recognition 
of microbe-associated molecular patterns 
(MAMPs) by plasma membrane-localized 
pattern recognition receptors (PRRs) that 
initiate an effective immune response upon 
activation (1). Whether parasitic plants are
also sensed as foes by these receptors was un- 
til now unknown. On page 478 of this issue, 
Hegenauer et al. report the identification of 
a canonical PRR that is required for respon- 
siveness to a MAMP-like molecule from the 
parasitic plant Cuscuta reflexa and protects 
plants against parasitic attack (2). This find- 
ing opens the possibility of biotechnological 
applications for sustainable crop protection 
against these devastating parasites. 

The ability of plants to store energy from 
sunlight makes them attractive targets for 
pests and parasites looking to exploit their 
carbohydrates, nutrients, and water. Not 
surprisingly, parasitic organisms, including 
microbes, insects, nematodes, and parasitic 
plants, attack plant hosts as their primary 
source of nutrition. The biological threat 
mounted by parasitic organisms can cause
devastating agricultural losses and can profoundly
influence natural ecosystems. Parasitic
plants are estimated to cost billions of
dollars annually in crop losses across five 
continents, with particular impact in Africa (3). 

Seeds of parasitic plants survive in soil for 
years, but after germination they must find 
a susceptible host plant within a few days. 
Plants of the genera Striga and Cuscuta have 
sophisticated mechanisms to sense host 
plants by perceiving molecules exuded by the 
host—a strategy that dramatically increases 
the chances of a successful infection (3). 
Once in contact with the host, the survival 
of the parasite depends on the rapid forma- 
tion of specialized feeding structures called 
haustoria that penetrate the host cell wall 
and siphon nutrients directly from vascular 
tissue (4). Haustoria formation is a common
pathogenicity strategy also used by fungi and 
oomycetes. Therefore, the virulence strategy 
of parasitic plants seems to share common- 
alities with parasitic microbes, raising the 
possibility that molecular aspects of host- 
parasite interaction may also be conserved. 

Hegenauer et al. investigated the inter- 
action between the plant stem parasite 
C. reflexa and cultivated tomato (Solanum
lycopersicum), which is resistant to this para- 
site (5). They found that tomatoes respond 
to C. reflexa extracts with immune pheno- 
types similar to those associated with MAMP 
perception, which suggests that C. reflexa 
indeed produces a MAMP-like molecule. To 
identify the source of host perception asso- 
ciated with the Cuscuta factor, the authors 
screened a population of tomatoes differ- 
ing in the ability to recognize the elicitor. 
They used a clever series of genetic tricks to 
assign Cuscuta factor responsiveness to a 
canonical host PRR. The new receptor con- 
tains external leucine-rich repeats and as- 
sociates constitutively with protein kinases 
of the SOBIR (suppressor of BAK1-interact- 
ing receptor kinase) type (6), a hallmark of 
this type of PRR. The authors renamed this 
receptor CuRe1 (Cuscuta receptor 1), which
becomes the first surface-localized receptor 
found to recognize a parasitic plant. 

The authors could not identify the struc- 
ture of the Cuscuta factor precisely, but it 
appears to be a small, potentially O-glyco- 
sylated peptide that is associated with, and 
possibly a structural component of, the C. reflexa
cell wall. The Cuscuta factor is present
in all Cuscuta species tested but is absent 
from the related nonparasitic species Caly- 
stegia sepium or from unrelated parasitic 
plants, which contrasts with the current 
paradigm that MAMPs are widely con- 
served in a given class of organisms (7)—in 
this case, plants themselves. This poses an 
obvious problem: If plants could respond to 
general plant factors, a predictable outcome 
might be autoimmunity. This implies that 
the Cuscuta factor precisely identifies mem- 
bers of this genus. It is tempting to specu- 
late that the Cuscuta factor might be a cell 
wall component intimately associated with 
a parasitic lifestyle or a common cell wall 
protein with a specific posttranslational 
modification. The narrow distribution spec- 
trum of the Cuscuta factor likely reflects 
the strong selection pressure against plants 
perceiving themselves as nonself. 

PRRs can function when transferred 
across plant species barriers (8, 9). To evaluate
the potential of CuRe1 to protect susceptible
plants from C. reflexa attack, the
authors used genetic engineering techniques
to transform the CuRe1 gene into closely
and distantly related susceptible plants. The
CuRe1-transformed lines became sensitive to
the Cuscuta factor and were more resistant to
C. reflexa attack. Thus, CuRe1 has the potential
to protect crop plants against infestation
by Cuscuta.

Remarkably, Hegenauer et al. found that
cultivated tomato has mechanisms of resistance
against C. reflexa in addition to CuRe1.
This accords with what is known about the
multiple layers of incompatibility between
parasitic Striga species and their hosts (10).
For microbial pathogens, haustoria serve as
secretion sites for virulence proteins called
effectors that promote pathogenesis but may
be recognized by intracellular host immune
complexes encoded by resistance (R) genes,
strongly reactivating immunity (1). Host resistance
to Striga relies on classical R proteins
(11); it therefore follows that parasitic plants
may secrete their own effectors via haustoria,
and that these in turn might be recognized
by the host immune system in resistant
hosts. In fact, the authors noted that tomato
plants lacking CuRe1 were not susceptible
to Cuscuta infection, implying further active
resistance mechanisms in this species. Putative
Cuscuta effectors would have to be very
highly evolved so that they are active in the
host but not within the parasitic plant.

Know your enemy. During infection, the tomato (S. lycopersicum) surface-localized immune receptor CuRe1 detects the
presence of the Cuscuta factor released from the cell wall of the parasitic plant C. reflexa. CuRe1 contains external leucine-rich
repeats and associates constitutively with SOBIR-type protein kinases. Heterologous expression of CuRe1 into closely
and distantly related susceptible plants confers resistance against the attack of C. reflexa. PM, plasma membrane.

The identification of CuRe1 represents
a major breakthrough in understanding
the strategies used by plants to sense danger
from diverse origins. This work greatly
advances our understanding of the mechanisms
controlling plant resistance to parasitic
plants while at the same time opening
up new avenues of research. For example, we
need to understand the nature of the Cuscuta
factor and whether other parasitic plants
are perceived by PRRs. Another outstanding
question is whether parasitic plants inject
effectors into their host and whether R proteins
recognize these effectors, as in the case
of parasitic microbes. We also need to know
why these effectors are not active on their
parasites of origin.

On the basis of these new results, interfamily
transfer of CuRe1 emerges as a feasible
strategy for crop protection. Furthermore,
standardized bioengineering and gene editing
approaches will accelerate the engineering
of receptors with novel ligand specificities
(12, 13). This is the beginning of an exciting
journey that will allow us to understand the
intracellular dialog during parasitic plant-plant
associations with broad applications in
biology and agriculture.
SYNTHETIC BIOLOGY


On the record with E. coli DNA 


DNA can be engineered for cells to record events in real time 


By Olivier Borkowski, Charlie Gilbert,
Tom Ellis


Synthetic biology uses DNA programs
to add new functions into living cells.
By expressing transcription factors,
microbes such as Escherichia coli can
be made to perform complex computational
logic (1) and pass “memories”
of selected events between generations (2).
But DNA is more than just the source code 
for protein expression programs; DNA can be 
used as the storage medium for information. 
In synthetic biology, the use of DNA for in 
vivo information storage was first realized in 
2009 with a synthetic genetic program that 
enabled E. coli to count events by using a
recombinase to rearrange DNA in response 
to an input (3). With multiple recombinases, 
this technology could be used to store 1.375 
bytes of information in a living E. coli (4). As 
reported by Shipman et al. (5) on page 463 
of this issue, and by Roquet et al. (6), in vivo
encoding of information into DNA is pushed 
even further, using either genome editing to 
store dozens of bytes of data, or employing 
multiple recombinases to realize “state ma- 
chines” inside living cells. 

The clustered regularly interspaced short 
palindromic repeats (CRISPR)-Cas system 
is a prokaryotic adaptive immune system, in 
which CRISPR regions within the genome 
capture foreign DNA sequences during in- 
fections. These are later used by Cas9 to rec- 
ognize and digest foreign DNA bearing the 
same sequence. The DNA sequences incorpo- 
rated by CRISPR, called protospacers, form 
a memory of all previous infections encoun- 
tered by the cell. 

To create living cells engineered to record 
information into their DNA, Shipman et al. 
exploited the natural recording capacity of 
CRISPR-Cas using the native E. coli system 
to integrate multiple, user-defined proto- 
spacer sequences into CRISPR arrays. Using 
electroporation, bacteria take up oligonucle- 
otides 35 base pairs (bp) in length and inte- 
grate this DNA as protospacers. Because the 
temporal order of protospacer acquisition 
events translates to a physical order in the 
CRISPR array, the record of what information
(DNA sequences) was given to the cells
at what time can be retrieved by sequencing
the full CRISPR locus (see the figure). Shipman
et al. integrated 3 sets of 5 oligonucleotide
protospacers (15 elements) in this way.
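
The link between time order and physical order can be illustrated with a toy model. This is a sketch under a stated assumption, not the authors' method: it assumes each new sequence is integrated at one fixed end of the array, so that reading the array from that end gives the most recent event first. The function names and the placeholder sequences are hypothetical.

```python
def record(array, oligo):
    """Return a new array with the oligo integrated at the leader end."""
    # Assumption: each new spacer is inserted at the front of the array
    return [oligo] + array

def read_in_time_order(array):
    """Recover the order of events by reversing the sequenced array."""
    return list(reversed(array))

# Three hypothetical 35-character sequences delivered one after another
events = ["A" * 35, "C" * 35, "G" * 35]

crispr_array = []
for oligo in events:
    crispr_array = record(crispr_array, oligo)

recovered = read_in_time_order(crispr_array)
print(recovered == events)
```

In this model the newest sequence always sits first in the array, so a single sequencing read of the locus is enough to reconstruct what was delivered and in what order.
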
To expand the data storage capacity of
their system, Shipman et al. generated modified
versions of Cas1 and Cas2 capable of integrating
oligonucleotides in either forward
or reverse orientations. Theoretically, this
recording device can store between 20 and
100 bytes of DNA information in the genome
of living E. coli. However, as CRISPR arrays
are naturally capable of harboring hundreds
of protospacers, it is likely that the storage
capacity per cell could easily be increased.

Molecular recording devices. (Left) In the CRISPR-based recording device, short oligonucleotide sequences are
integrated into a genomic CRISPR array between short spacer sequences (light gray diamonds) through the action of
the Cas1-Cas2 complex. The order of sequential oligonucleotide integrations (colored squares) is preserved in the spatial
arrangement of the CRISPR locus. The information recorded can then be retrieved by next-generation sequencing of
the CRISPR locus. (Right) For the recombinase-based state machine, small molecules control the expression of recombinases.
The recombinases act on a DNA register, consisting of nested recombinase recognition sites (orange and blue
triangles and half-ellipses). Depending on the order of recombinase activation, the DNA register adopts different states
that can be read by DNA sequencing (or by reporter-protein expression).

Although the method of rearranging DNA
with recombinases is unable to store similar
amounts of information in vivo, it can
be much more readily exploited by having
it interact with native and synthetic gene
expression programs. Roquet et al. used
recombinase-based memory to build a
classic motif from computation: the state
machine. State machines perform order-dependent
input processing, so that at any
point the system can be interrogated and
the identity and order of inputs deduced.
In response to external inputs, multiple recombinases
recorded memory into a DNA
“register,” a synthetic sequence hosted on a
plasmid consisting of an array of nested recognition
sites for the different recombinases.
Depending on the inputs and thus on the
order of activation of the different recombinases,
the register is rearranged into distinct
orientations (see the figure). With just three
inputs controlling three recombinases, it is
possible to build a 16-state machine in E. coli.
Theoretically, with seven inputs and seven
orthogonal recombinases, it would be possible
to encode 13,700 distinct states.
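
The quoted state counts are consistent with a simple counting argument, offered here as an assumption rather than as the authors' derivation: if each of N recombinases can fire at most once and the register distinguishes every ordered sequence of distinct inputs, the number of states equals the number of ordered subsets of N items, including the empty one. The function name is hypothetical.

```python
from math import factorial

def ordered_subsets(n_inputs):
    """Count ordered sequences of distinct inputs, including the empty sequence."""
    # For each length k, there are n! / (n - k)! ways to pick and order k inputs
    return sum(factorial(n_inputs) // factorial(n_inputs - k)
               for k in range(n_inputs + 1))

for n in (3, 7):
    print(n, "inputs ->", ordered_subsets(n), "states")
```

Under this assumption three inputs give 16 states and seven inputs give 13,700, matching the figures in the text.
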

To reveal the order and type of inputs 
received by any cell or population of cells, 
the DNA registers can simply be sequenced
from the bacteria, or can also be quickly de- 
termined by quantitative polymerase chain 
reaction. Alternatively, the register sequences 
can be designed to contain genetic parts 
such as promoters and transcription factors. 
When these functional registers rearrange 
into certain states, they can trigger gene 
expression changes that lead to different 
E. coli phenotypes. Roquet et al. created a 
database of thousands of regulatory pro- 
grams in which the cell’s behavior is tied to 
the state of the register. Using these biologi- 
cal outputs means that the state machine can 
lead the cells to take actions in response to 
the appropriate string of inputs. 

Currently, the inputs used by both the 
CRISPR and recombinase systems are simple 
molecules. Roquet et al. used small molecules 
to activate recombinase expression, whereas 
Shipman et al. transformed synthetic oligonucleotides
into cells. However, both studies
suggest ways to feed in complex cellular phe- 
notypes as inputs. By replacing recombinase 
regulation with alternative genetic control 
systems, Roquet et al. propose that state 
machines could monitor environmental sig- 
nals or gene expression. Similarly, Shipman 
et al. suggest using their device to capture 
gene expression information by convert- 
ing native messenger RNA transcripts into 
protospacers through the action of reverse 
transcriptase. As both Shipman et al. and 
Roquet et al. suggest, these strategies could 
be used to generate intrinsic devices within 
cells that autonomously record the timing of 
complex and inaccessible processes, such as 
gene dysregulation in cancer or developmen- 
tal gene regulatory cascades. As McKenna et 
al. (7) describe on page 462 of this issue, an 
autonomous method that continually mu- 
tates a genomic locus using CRISPR-Cas has 
provided valuable information for whole- 
organism genetic lineage tracing. 

The methods used by Shipman et al.
and by Roquet et al. represent new bench- 
marks in both in vivo information record- 
ing and biological computation, and show 
that DNA can be predictably edited over 
generations of living cells. Linking these 
DNA memories with the inherent power 
of cells to sense and act on their envi- 
ronments will no doubt lead to many ex- 
citing advances for synthetic biology. 
CLIMATE CHANGE


The smoking gun for 
Atlantic circulation changes 


Deep-sea sediments record changes in ocean 
circulation during the last glacial age 


By Andreas Schmittner 


The Atlantic meridional overturning
circulation (AMOC) has long been
implicated in rapid climate changes
during the most recent ice age.
However, convincing observational
evidence from the deep ocean has remained
limited. On page 470 of this issue,
Henry et al. (1) present two independent 
records from a well-dated, high-resolution 
sediment core from Bermuda Rise in the 
North Atlantic (see the figure) that may be 
the smoking gun that paleoceanographers 
have been looking for. 

The AMOC is part of a system of cur- 
rents that spans Earth’s oceans (2). In the 
Atlantic, these currents flow northward 
near the surface from the southern hemi- 
sphere across the equator, via the Gulf 
Stream, and into the subpolar North At- 
lantic. There, the chilled and salty waters 
sink to depths of ~2 to 3 km and flow back 
south along the margin of the Americas 
into the Southern Ocean as North Atlantic 
deep water (see the figure). This flow pat- 
tern transports heat from the southern to 
the northern hemisphere
and keeps the North Atlantic
region warm and
Antarctica cool (3).

During the most recent
ice age, rapid and
large climate changes occurred
at time scales of
several centuries to mil- 
lennia. These Dansgaard- 
Oeschger (D-O) events 
were characterized by 
abrupt warming and 
cooling in the North At- 
lantic and more gradual 
temperature changes of 
the opposite sign around 
Antarctica (4). This 
asymmetric response of 
surface temperatures is 
consistent with disrup- 
tions to the interhemi- 
spheric heat transport 
by the AMOC. Theory
and modeling suggest
that rapid transitions between different
AMOC states can be triggered by freshwater
fluxes, which could cause the observed
changes in surface temperatures (5).

Despite evidence from surface temperature
reconstructions that support
the idea of AMOC variability causing the
D-O events, doubts have lingered because
surface temperatures can be affected by
many processes. Evidence from the deep
sea would be more convincing because the
AMOC is, after all, a deep-sea phenomenon.
But few high-resolution, well-dated
paleoceanographic records from the deep
sea exist, hampering unequivocal attribution.
Moreover, paleoceanographic reconstructions
are indirect and rely on proxies,
making it difficult to infer ocean circulation
changes.

Deep ocean changes. Based on data from a deep-sea core at Bermuda Rise,
Henry et al. argue that particularly cold phases during the last glaciation were
associated with reduced ocean circulation in the North Atlantic. The map
depicts today’s deep-sea circulation near the core (located at 33° 41.443′ N,
57° 34.559′ W).

One such proxy is the ratio of two radioisotopes,
protactinium and thorium (Pa/Th),
which are produced constantly at a
ratio of 0.093. Deviations from this ratio
have been interpreted as the effect of the
AMOC, although other secondary processes 
also influence Pa/Th (6). Another AMOC 
proxy is the standardized carbon isotope
ratio in the fossil shells of bottom-dwelling
(benthic) foraminifera (δ¹³C_bf); these
organisms record the carbon isotope ratios
of dissolved inorganic carbon in the water
column. δ¹³C_bf is also affected by other
processes, but those can be simulated in
three-dimensional (3D) models (7) and differ
from the secondary processes affecting
Pa/Th.

Thus, if secondary processes dominated
the variations of Pa/Th and/or δ¹³C_bf, we
would not expect to see any correlation
between them. However, Henry et al.’s
Pa/Th and δ¹³C_bf records are highly correlated
both with each other and with sea
surface temperature (SST) variations in a
nearby sediment core. In their reconstruction,
cold phases in the North Atlantic are
~2°C cooler in the subtropical North Atlantic.
Pa/Th increases during these cold
phases, approaching the production value,
and δ¹³C_bf decreases by ~0.5 per mil. In 3D
climate model simulations, collapse and
resumption of the AMOC caused by freshwater
perturbations to the North Atlantic
result in similar changes in SSTs and
δ¹³C_bf at the core location (8).

This agreement indicates that at least 
some of the ice age D-O events, particularly
those accompanied by massive iceberg
rafting, were associated with large
AMOC changes and perhaps even AMOC
collapses. But Henry et al.’s data also indicate
a variety of responses, with other
events showing smaller changes. 

In the future, combining high-quality 
paleoceanographic reconstructions with 
3D circulation models that directly simu- 
late the proxies may allow AMOC changes 
to be quantified throughout the most re- 
cent ice age. This could help to better as- 
sess how AMOC changes affect ecosystems 
and societies, such as shifts in the inter- 
tropical convergence zone, ocean produc- 
tivity, and carbon cycle, and how likely 
they are to occur in the future as a result 
of human-caused climate change (9). 
ECOLOGY


Plant extinctions take time 


Many plant species may already be functionally extinct 


By Quentin Cronk 


The recent State of the World’s Plants re- 
port from the Royal Botanic Gardens, 
Kew (1) estimates that 50,000 of the 
~390,000 known vascular plant spe- 
cies are at risk of extinction. Given the 
rarity of so many plants, coupled with 
widespread environmental destruction over 
the past quarter-century, we might expect 
that a lot of plants should have gone extinct. 
Indeed, estimates made in the early 1990s 
suggest that up to 30,000 species should 
have gone extinct by 2015 (2, 3). Yet, the In- 
ternational Union for Conservation of Nature 
(IUCN) Red List of Threatened Species data- 
base for 2016 has fewer than 150 extinct spe- 
cies. How can we explain this discrepancy? 

Proving an absence is an age-old problem 
in science. To prove the ex- 
istence of an organism, one 
can collect a specimen and 
put it in a museum collec- 
tion. Proving that an organ- 
ism does not exist is more 
problematic. It is always 
possible that we have not 
looked hard enough. As a 
result, the answer to the seemingly simple 
question of how many plant species have 
become extinct in the Anthropocene is that 
no one really knows. But this does not mean 
that there is no problem. Many plant species 
may be on an inevitable path to extinction, 
even though isolated specimens can survive 
for decades or more. 

To derive global plant extinction rates ex- 
pected by 2015, multiple studies in the early 
1990s estimated likely species losses based 
on the habitat area expected to be lost (2, 3). 
This species-area model is potentially prob- 
lematic (4). But even if the model is flawed, 
the underlying concept is uncontroversial: 
less habitat, fewer species. The studies ar- 
rived at the conclusion that between 4000 
and 30,000 species would be extinct by 2015 
(2, 3). In contrast, the IUCN Red List data- 
base of 2016 lists only 142 extinct plants; 105 
of these are completely extinct, and the re- 
maining 37 are extinct in the wild but sur- 
vive in cultivation. The discrepancy between 
the recorded extinctions and earlier expec- 
tations cannot be explained by lower-than- 
expected habitat destruction. In fact, many 
new threats to tropical forest ecosystems 
have emerged since 1990, notably the expan- 
sion of industrialized agriculture driven by 
increased demand for soy and palm oil. 
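
The species-area reasoning behind such estimates can be sketched numerically. The example below assumes the widely used power-law form S = c * A**z, which the text does not spell out; the exponent z = 0.25 and the species count and habitat fraction are illustrative assumptions, not values taken from the cited studies (2, 3).

```python
def species_remaining(original_species: float,
                      area_fraction_left: float,
                      z: float = 0.25) -> float:
    """Species expected at equilibrium after habitat shrinks.

    Assumes the power-law species-area relationship S = c * A**z.
    The constant c cancels when comparing before and after, so only
    the fraction of habitat left and the exponent z (an assumed,
    illustrative value) are needed.
    """
    if not 0.0 <= area_fraction_left <= 1.0:
        raise ValueError("area_fraction_left must be between 0 and 1")
    return original_species * area_fraction_left ** z


def predicted_losses(original_species: float,
                     area_fraction_left: float,
                     z: float = 0.25) -> float:
    """Extinctions the model predicts once the system has equilibrated."""
    return original_species - species_remaining(
        original_species, area_fraction_left, z)


if __name__ == "__main__":
    # Hypothetical region: 1000 species, half of the habitat destroyed.
    print(round(predicted_losses(1000, 0.5)))
```

Under these assumptions, halving the habitat leaves about 84% of the species at equilibrium. The model says nothing about how long that loss takes to appear, which is where the lag-time argument comes in.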

Two possible explanations present them- 
selves. First, there may be many undocu- 
mented extinctions. It is true that extinction 
lists are highly conservative. Compilers have 
to be cautious, because even when an organ- 
ism is declared extinct, there is always the 
possibility that it will be rediscovered. How 
much searching has to be done before a spe- 
cies that has not been seen for more than 50 
years is declared extinct? And who will do 
the searching? Biologists generally prefer to 
do fieldwork in places where species survive, 
not degraded areas where extinction has oc- 
curred. For practical reasons, a complete list 
of extinct plants may be impossible to obtain. 
Even so, it is hard to imagine that we would 
miss thousands of predicted extinctions. 

Second, there may be a 
long extinction lag time. 
The species-area curve is 
driven by equilibrium phe- 
nomena, and ecosystems 
may take a long time to 
equilibrate. Diamond has 
used the term “relaxation 
time” to describe this ex- 
tinction lag (5). Janzen calls the species in 
this extinction waiting room “the living 
dead” (6). He noticed that the agricultural 
landscapes that replaced native forest in 
Costa Rica were not devoid of native trees. 
Forest remnants hung on at field margins 
and in small forest patches. However, these 
trees could not regenerate because habitat 
suitable for seedlings no longer existed. The 
trees, although living out their physiological 
life, were “just as dead...as if they were in 
the back of a logging truck” (6). 

Numerous factors influence the extinc- 
tion lag time (7). Broadly, these can be di- 
vided into those intrinsic to species (such as 
longevity of individuals, presence of a seed 
bank, and sensitivity to inbreeding depres- 
sion) and extrinsic factors (such as spatial 
scales and patch structure). Long-lived spe- 
cies in large areas will have long extinction 
lag times and vice versa. 

There are several reasons that plants 
should survive longer than animals as liv- 
ing dead. First, plants may have seed banks 
in the soil; until these seed banks are ex- 
hausted, occasional plants may appear. Some 
invertebrates have stages that can diapause 
in lake mud for multiple years, but this strat- 
egy is generally rarer in animals than in 
plants. Second, few animals have life spans 
matching those of woody plants, which may 
live hundreds of years. An exception is Lone- 
some George, the last known Pinta tortoise 
(Chelonoidis abingdoni), who extended the 
living-dead phase to almost plantlike pro- 
portions by his longevity. Third, many plants 
can reproduce asexually or self-fertilize, 
and the last individual plant may therefore 
produce occasional successors, whereas the 
single last animal rarely does. 

The slow-burning fuse of plant extinction. (A) This conceptual scheme relates the extinction debt (the number of 
species expected to become extinct as the result of an extinction-causing event) to the relaxation time (the lag time to 
complete this process). (B) Small areas and fast demographic turnover promote rapid extinction and hence short persistence 
of the extinction debt. (C) Large areas or slow demographic turnover promote slow extinction and long persistence of 
extinction debt. (D and E) Conservation interventions can have different effects. Type 1 conservation reduces the extinction 
debt and thus prevents extinction (D); type 2 conservation extends the extinction lag time and thus delays extinction (E). 

There are thus strong reasons to expect the 
relaxation times for plant extinction to greatly 
exceed those of animals. Indeed, the processes 
and parameters of plant extinction may be 
quite different from those of animals. It is 
therefore important to understand what hap- 
pens during the relaxation time. In this vein, 
Downey and Richardson recently introduced 
the concept of a plant extinction trajectory 
that passes through several defined stages (8). 

How long can plant relaxation times be? 
The South Atlantic island of St. Helena pro- 
vides an interesting case study. The Portu- 
guese navigators who discovered it in 1502 
quickly introduced goats. Without predators, 
the goats multiplied into huge flocks. Most 
vegetation was destroyed, and plants became 
extinct. What is remarkable, however, is the 
tenacity of the living dead. For instance, the 
St. Helena olive (Nesiota elliptica) fell to an 
unsustainable population level of 12 to 15 
plants in the mid-19th century (9) but only 
became extinct in 2003. It was arguably just 
as extinct with a population of fewer than 10 
in 1900 as it is now. The extinction lag times 
for woody plants on St. Helena are thus mea- 
sured in many centuries. Studies on temper- 
ate forest herbs similarly indicate lag times 
of more than 100 years (10), whereas small 
mammals in tropical forest fragments have 
median extinction lag times of only 13 years (11). 

This slow-burn extinction in plants raises 
a number of questions. The first is a practi- 
cal one. If human actions over the past 25 
years have set in train a mass extinction 
that will take 100 years or more to play out, 
then how do we identify the living dead and 
what should be our response? At the very 
least, the long plant relaxation time allows 
us to sequence the whole genomes of the 
rarest plants, so that we will know a little 
more about the organisms we lose (and un- 
derstand, or even address, some of the ge- 
netic problems they face). As sequencing 
costs drop, a coordinated global program 
of rarity genomics is something that can be 
considered. 

The second question is whether we can 
use the relaxation time as a window of op- 
portunity for conservation. Can we drive the 
equilibrium back to a state that retains more 
species? The answer is probably a cautious 
yes. Another way of expressing the difference 
between functional extinction (that is, the 
living dead) and census extinction (where no 
individuals survive) is through the concept of 
“extinction debt,” a term coined by Tilman et 
al. from metapopulation dynamics (12). Ex- 
tinction debt is the number of extinctions 
expected to result, sooner or later, from an 
extinction-causing event. Facing the slippery 
problem of quantifying census extinction, ex- 
tinction debt may be a more useful approach 
(13). It may be possible, by smart ecological 
restoration and reserve selection, to pump 
“species credit” into ecosystems to reduce 
extinction debt (14). 

There are some hopeful stories of spe- 
cies brought back from the brink sustain- 
ably. An example of a critically rare species 
that has possibly been moved into the sus- 
tainably safe zone is the Mauritius kestrel, 
which was made almost extinct by anti- 
malarial DDT use in the 1950s and 1960s, 
but intensive management from 1974 
worked (15). The kestrel escaped irrevers- 
ible genetic problems only because of the 
very short duration of its bottleneck and 
the total cessation of DDT use in 1970. The 
success of conservation projects will de- 
pend on whether they merely prolong the 
extinction lag time or whether they reduce 
the extinction debt by dealing with funda- 
mental ecosystem and genetic problems, 
moving species into the sustainably safe 
zone (see the figure). 

At this juncture, rather than estimating 
census extinction, it is important that sci- 
entists develop better models for assessing 
global extinction debt and maximizing spe- 
cies credit. If the high 1990s estimates of 
plant extinction by 2015 are accurate esti- 
mates of extinction debt, then we are facing 
slow-creeping biodiversity loss on a large 
scale. Failing to notice this because the 
time scale is too long would not be smart. 
Countering imprecision in 
precision medicine 


Better coordination is needed to study complex interventions 


By Spencer Phillips Hey and 
Aaron S. Kesselheim 


The goal of precision medicine (PM) 
is to “ensure that the right treatment 
is delivered to the right patient at the 
right time” (1). Predictive biomarker di- 
agnostics are critical to this effort. Yet 
despite substantial promise, PM has 
been plagued with problems (2). Many com- 
mercially available biomarker diagnostics 
have not been adequately validated (3); the 
scientific literature is flooded with low-qual- 
ity and unreliable reports (2); and even os- 
tensibly successful PMs, such as trastuzumab 
(Herceptin) chemotherapy in HER2-express- 
ing breast cancer, have been characterized by 
uncertainty about how to use and interpret 
diagnostic test results (4). These disappoint- 
ing features of PM research can be explained 
by three obstacles inherent in the science: (i) 
Biological theories play a central role in the 
testing methodology; therefore, pivotal trials 
of PMs cannot be agnostic about underlying 
mechanisms. (ii) Interventions are complex 
with many components and degrees of un- 
certainty that need to be resolved before clin- 
ical use. (iii) No single stakeholder controls 
the biomarkers or coordinates the research 
program. Although some new regulatory (5) 
and evidence synthesis (6) efforts are de- 
signed to address these problems, we believe 
that meaningful progress in PM requires new 
mechanisms of scientific oversight. 


THEORY-DRIVEN RESEARCH 
The gold standard for evaluating the efficacy 
of an experimental drug is a two-arm ran- 
domized controlled trial (RCT). One virtue of 
this method is its agnosticism about underly- 
ing biological theory. An RCT can decisively 
test the clinical hypothesis “treatment T is 
effective for condition C” without needing a 
theoretical explanation for why T is effective. 
Most PM has a more ambitious aim: to 
discover and harness a true biological expla- 
nation for why a drug will work for an in- 
dividual patient. Hypotheses take the form: 
“Treatment T is effective for condition C, as 
defined by testing positive for biomarker B, 
where B is determined by diagnostic assay 
A.” Additional assumptions (why A is a reli- 
able test for B; why B should predict activ- 
ity of T against C) are now “built into” the 
hypothesis, so decisive tests of PM cannot 
be agnostic about underlying theory. If the 
theory is wrong, and we are not careful about 
testing it, the result will be systematic mis- 
classification and worse patient outcomes— 
exactly what PM intends to avoid. This does 
not mean that etiological theories must be 
perfect, but we do need to actually test the 
underlying hypotheses before PMs are un- 
leashed into the clinic. 


Published by AAAS 


Using biomarkers as outcomes in clinical trials 
requires more attention to validation. 


Although various experimental modali- 
ties can provide some evidence of biomarker 
validity (e.g., retrospective designs, enrich- 
ment designs), a decisive test of a biomarker 
hypothesis requires that participants are 
prospectively stratified according to both 
biomarker status and treatment assignment 
(7). Unfortunately, this only rarely happens in 
PM development (8). 


COMPLEX INTERVENTIONS 

Although the unit of clinical translation in 
traditional drug development is a single ther- 
apeutic agent, the unit in PM is a “biomarker 
ensemble”: (i) a biomarker, hypothesized to 
play a crucial role in the disease pathway; 
(ii) a diagnostic assay, used to determine a 
patient’s biomarker status; and (iii) a thera- 
peutic agent, intended to be more effective 
for patients who are “biomarker-positive.” 
This increased complexity, with uncertain- 
ties about each component of the ensemble, 
means that there are more threats to the va- 
lidity of a PM trial. When biomarker or assay 
misclassification rates are not known or not 
reported—which happens frequently (9)—the 
study results are difficult to interpret. In the 
case of a negative result: Was the therapy in- 
effective, the biomarker a poor predictor of 
therapeutic response, or the diagnostic un- 
able to accurately classify patients according 
to their true biomarker status? Was it some 
combination of these factors? 
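
The interpretive problem can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not a method from the cited work: it treats the assay as having a known sensitivity and specificity, and it assumes the average treatment effect among test-positive patients is a simple mixture of the effects in truly biomarker-positive and truly biomarker-negative patients. All function names and numbers are hypothetical.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Share of test-positive patients who truly carry the biomarker."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no patient would test positive with these inputs")
    return true_pos / (true_pos + false_pos)


def observed_effect_in_test_positives(effect_if_positive: float,
                                      effect_if_negative: float,
                                      sensitivity: float,
                                      specificity: float,
                                      prevalence: float) -> float:
    """Average effect a trial would see among test-positive patients.

    Assumes a simple mixture: misclassified (false-positive) patients
    respond like biomarker-negative patients.
    """
    ppv = positive_predictive_value(sensitivity, specificity, prevalence)
    return ppv * effect_if_positive + (1.0 - ppv) * effect_if_negative


if __name__ == "__main__":
    # Hypothetical drug: 30-point benefit in true positives, none otherwise.
    # Hypothetical assay: 90% sensitive, 80% specific, 20% prevalence.
    print(observed_effect_in_test_positives(0.30, 0.0, 0.9, 0.8, 0.2))
```

With these hypothetical inputs, barely half of the test-positive patients truly carry the biomarker, so a 30-point benefit shrinks to about 16 points. If the misclassification rates are not reported, a weak result cannot be attributed to the drug, the biomarker, or the assay.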

This problem of underdetermination 
frustrates efficient falsification of bio- 
marker theories (10). The front-loading and 
multiplying of clinical uncertainties com- 
bined with deficiencies in theory testing can 
explain why physicians have been reluctant 
to trust PM, insurers have been reluctant to 
reimburse for diagnostics, and companies 
have been less willing to invest (2). 


CONTROL AND COORDINATION 
Unlike a drug, a biomarker is not manufac- 
tured or controlled by a single entity. Much 
biomarker research operates under an “open 
science” model, which permits relatively free 
access to testing and development of bio- 
markers and allows individual stakeholders 
to make independent judgments about the 
state of evidence and the allocation of re- 
search resources. This also means that there 
is no single stakeholder with authority or 
responsibility for directing the scientific re- 
search effort surrounding a particular bio- 
marker, which can lead to inefficiencies at 
the research system level. 

These represent potentially wasteful and 
inefficient activities. An open science model 
has many attractive features, but it does not 
have mechanisms to restrict stakeholders 
from pursuing a hypothesis or a particular 
method, even when many in the community 
judge that it is no longer valuable or reason- 
able to do so. A stakeholder’s judgment about 
why a candidate PM or a driving theory has 
failed may not resonate throughout the scien- 
tific community in an efficient way. 

The lingering controversy about the ap- 
propriate assay or cut-off score for distin- 
guishing HER2-positive from HER2-negative 
breast cancers illustrates how greater over- 
sight is needed for what should count as a 
successful PM validation (4). Despite the 
wide use of HER2 testing in the clinic, un- 
certainty remains about how to test for the 
biomarker and interpret the results. Trials 
designed to resolve these uncertainties are 
under way, but ideally, these issues should 
have been resolved before the test was 
widely adopted in clinical practice. Until the 
optimal biomarker ensemble is identified, 
patients are being systematically misclassi- 
fied and harmed as a result. 


ETHICS AND POLICY 

Failure to conduct scientifically valid research 
programs undermines public support for 
the medical research enterprise and fails to 
protect the interests of patient-subjects. The 
ethics of medical research demands that pa- 
tient-subject burdens are redeemed by gains 
in generalizable knowledge. Many human 
and material resources have been expended 
in pursuit of PMs (1, 2). Unfortunately, these 
costs are often not adequately redeemed by 
scientific gains. For example, the Evaluation 
of Genomic Applications in Practice and Pre- 
vention (EGAPP) initiative, which provides 
systematic summaries and recommendations 
for clinicians and health payers about the 
utility of particular genomic tests, has most 
often found that there is “insufficient evi- 
dence” to support genetic testing (6). 

This suggests that existing social mecha- 
nisms, such as peer review, the commercial 
market, and broad public-funding priori- 
ties—which are responsible for “passively” 
coordinating medical research activities 
under an open science model—are inad- 
equate to efficiently resolve the many un- 
certainties in PM. It may also indicate that 
institutional review boards (IRBs) lack the 
authority and access to information about 
the total biomarker research portfolio nec- 
essary to safeguard patient and public in- 
terests. Therefore, we believe new policy 
interventions are needed. 

The Food and Drug Administration (FDA) 
has already issued draft guidance on bio- 
marker trials that acknowledges the dan- 
gers of failing to test the driving biomarker 
theories (11) and is contemplating formal 
regulation of biomarker tests (5). We support 
these efforts and recommend that regulators, 
health care providers, and insurers explic- 
itly demand rigorous, biomarker-stratified 
trials for all PMs and companion diagnos- 
tics before licensure, clinical uptake, and 
reimbursement. 

For their part, researchers need to adhere 
to existing reporting guidelines (9) and to 
participate in biomarker study registration 
platforms (12). It is impossible for the sci- 
entific community to account for the com- 
plexity of biomarker ensembles if reports of 
previous research are inadequate or unreli- 
able. Journal editors and research funders 
will need to enforce compliance more strictly 
with these measures as conditions of publi- 
cation and funding. 

However, to overcome obstacles to scien- 
tific coordination stemming from the lack 
of centralized control and oversight over 
biomarker research, we believe more radical 
change in the mechanisms of scientific over- 
sight is needed. President Obama’s Precision 
Medicine Initiative is a promising develop- 
ment in this respect, but it is not clear that its 
focus on developing a large patient cohort to 
produce new knowledge about disease etiol- 
ogy and management will substantially affect 
the inefficiencies with clinical translation. 

Large U.S. research institutions, such 
as the National Human Genome Research 
Institute and National Cancer Institute, al- 
ready play a central role in setting national 
research priorities. But the coordinating ca- 
pacity of these institutions is dependent on 
the applications submitted to them by inves- 
tigators, and insofar as the overall research 
portfolio is not transparent, waste and inef- 
ficiency remain likely. 

We therefore propose that the major re- 
search institutions should combine their ef- 
forts with regulatory bodies and explicitly 
take on the responsibility for coordinating 
and evaluating PM research activities. Once a 
promising PM biomarker is identified, these 
institutions should prospectively map out the 
parameter space of the biomarker ensemble 
and then track the accumulating state of evi- 
dence (10). Unlike existing evidence synthe- 
sis efforts, which provide a recommendation 
based on a snapshot of the total evidence at 
some point in time, a centrally hosted, pub- 
licly accessible, and dynamic portfolio map 
would help the scientific community to ef- 
ficiently communicate and to coordinate its 
activities in real time. 

Specific advantages of this approach 
would include the following: (i) Funding 
agencies could award responsibility for spe- 
cific regions of the parameter space through 
their grants, which would mitigate research 
redundancy and waste; (ii) research teams 
would not have to sift through the literature 
or wait for systematic reviews but could in- 
stead immediately identify combinations of 
biomarker parameters needing replication 
and validation, as well as combinations that 
have been neglected, which indicate a gap 
in testing the theory; (iii) IRBs would have 
access to broader knowledge about the total 
state of evidence, which would allow them to 
make more informed judgments of social and 
scientific value. 

This approach would complement and ex- 
tend the evidence-grading schemes for PMs 
(13) and would reveal the patterns of research 
activities that warrant more or less attention. 
It would also help elucidate a much-needed 
standard for when a decisive test of a bio- 
marker’s clinical utility is possible. When the 
parameter space for a biomarker ensemble 
has been well-explored and the optimal en- 
semble has been identified, only then are the 
biomarker and its underlying theory ready 
for a pivotal trial. 

Obstacles to efficient resolution of PM 
uncertainties are not going to go away and 
will require us to acknowledge and face the 
scientific difficulty with an appropriate meth- 
odology. The approach to PM inquiry we 
have outlined, which should be led by public 
research institutions, will help achieve the 
promise that the field offers. 
Living in a microbial world 


Microbes take center stage in a charming tour 
of Earth’s microscopic menagerie 


By Susan Perkins 


Once you’ve experienced “the transfor- 
mation,” you never look at the world 
in the same way. For me, it occurred 
over 3 years of curating a public ex- 
hibit on the human microbiome at 
the American Museum of Natural 
History. After totally immersing myself in 
the topic, I viewed every bite of food, every 
handshake, every scratch of a dog’s head 
through a different lens. How was this ac- 
tion shaping my microbiome—the com- 
munity of trillions of microorganisms that 
called my body home? Ed Yong, the author 
of the new book I Contain Multitudes: The 
Microbes Within Us and a Grander View of 
Life, also experienced this transformation, 
and he’s written a delightful, witty book that 
will surely be a pathway for his readers to 
do the same. 

For most of human history, we were 
unaware of microbes. Then in the 1670s, 
the clever Dutch drapemaker Antonie van 
Leeuwenhoek crafted a handheld contrap- 
tion that could magnify objects more than 
250 times. He used it to explore pond wa- 
ter, insect guts, and samples taken from his 
own body, observing tiny “animalcules,” 
the first glimpses of living bacteria and 
protozoa. 

In the ensuing decades, new discover- 
ies linked microbes to human health and 
showed that certain ones were responsible 
for specific ailments. This “germ theory” 
of disease has been the prevailing view in 
medicine ever since. 

With better culturing methods and early 
DNA sequencing efforts, the census of the 
microbial menagerie that calls our planet— 
and our bodies—home began to expand. 
Next-generation sequencing methods have 
turbocharged that discovery and revealed a 
previously unimaginable diversity of novel 
microbes. 

Each one of us has a different pattern 
of microbial diversity that can correspond 
to variations in metabolism, disease, and 
even behavior. This discovery promises to 
be transformative for medicine and bio- 
technology. Acknowledging the growing 
momentum in this field, the White House 
Office of Science and Technology Policy 
recently announced the creation of the 
National Microbiome Initiative, a collabora- 
tive effort of federal, public, and private en- 
tities to accelerate microbiome science. 

Although they evolved independently, ant-eating 
mammals, including the pangolin (shown), armadillos, 
and aardvarks, all possess similar gut microbes. 

In 10 chapters, Yong vividly describes 
the intricate alliances forged by microbes 
with every other organism on the planet. 
He guides us on a historical journey, show- 
ing how key discoveries of symbioses, such 
as early work on the microbes that inhabit 
termite guts, have been crucial for opening 
up the “microbiome universe” to scientists. 
We learn how bacteria allow the Hawaiian 
bobtail squid to glow, how they permit pea 
aphids to subsist on nothing but plant sap, 
and how the wood rat has coopted them 
so it can digest otherwise toxic creosote. 
We also learn a lot about parasitic bacteria 
called Wolbachia that infect insects all over 
the globe, manipulating their behavior and 
reproductive fitness. As Yong reveals, these 
particular bacteria may be useful as a vec- 
tor control strategy in our fight against 
disease-bearing hosts like mosquitos. 

Any book on microbiomes would be 
remiss if it did not discuss dysbiosis—a 
condition in which microbes and host fall 
out of balance—and Yong does not disap- 
point, talking about things like emerging 
Clostridium difficile infections and the 
collapse of coral reef ecosystems. But he 
wisely never falls into the trap of labeling 
any microbes as inherently “good” or “bad” 
and stops short of predicting how condi- 
tions like obesity, infections, and allergies 
will soon be a thing of the past (as less rig- 
orous journalists have been known to do 
when writing about this burgeoning field). 

The most delightful part of Yong’s book 
is that he does not just tell the stories of 
microbiomes, he also introduces readers 
to dozens of the scientists studying them. 
He visited their labs, met their germ-free 
mice, and made field trips to the zoo with 
them. He probed them with questions and 
captures their motivations and personali- 
ties perfectly. Their stories and conversa- 
tions radiate the excitement of unlocking 
new secrets, putting a human face on the 
science. 

The title of the book comes from Walt 
Whitman’s classic poem Song of Myself, in 
which he rejoices in humanity’s role in the 
great ecosystem of Earth. In 1855, Whit- 
man could hardly appreciate the extent of 
the microbial world, but Yong deftly weaves 
it into the poet’s ode. As Whitman writes: “I 
am the mate and companion of people, all 
just as immortal and fathomless as myself.” 
HISTORY OF SCIENCE 


Interpreting evolution 


A probing history sheds light on the social and cultural 
biases that have shaped the study of human origins 


By Maurizio Meloni 


During one of the most dramatic 
waves of racist and eugenic ideas in 
the early 20th century, the great pa- 
leontologist George Gaylord Simpson 
astutely observed that there was no 
way to avoid a transfer of knowledge 
from biology to “social evolution”: “Whether 
or not they are really pertinent, biological 
theories are being used in this field, and 
the biologist necessarily has a part in the 
discussion” (1). Such fluxes of knowledge 
from biology to the wider society and their 
implications are the focus of Marianne 
Sommer’s History Within: The Sci- 
ence, Culture, and Politics of Bones, 
Organisms, and Molecules. 

The book’s subtitle refers to the 
tripartite structure of the book. The 
section entitled “History in Bones” 
focuses on the work of American 
paleontologist Henry Fairfield 
Osborn (1857-1935) at the Ameri- 
can Museum of Natural History 
(AMNH), “History in Organisms” 
centers around the British evolu- 
tionist Julian Huxley (1887-1975) 
and his work at the London Zoo 
and other facilities, and “History in 
Molecules” centers on the Italian- 
born, Stanford-based geneticist Lu- 
igi Luca Cavalli-Sforza (b. 1922) and 
his contributions to the Human Ge- 
nome Diversity Project. 

What these three scientists have in com- 
mon is easy to see: a passion and even a 
mission to transfer biological knowledge 
and provide “meaning and orientation” to 
the wider public. Across more than 500 
pages, Sommer capably conveys the com- 
plexity of their lives, particularly the contes- 
tations and accusations that their projects 
provoked. 

Osborn, who served as president of the 
AMNH from 1908 to 1933, transformed the 
museum into a preeminent site for exhibi- 
tion, attracting millions of visitors during his 
tenure. However, he also subscribed to a lin- 
ear, nonrandom view of evolution in which 
those of “Caucasian stock” (i.e., European) 
were positioned at the apex. Osborn, who 
presided over the Second International Con- 
gress on Eugenics in 1921 and collaborated 
extensively with Nordic supremacist Madi- 
son Grant, contributed to the perception of 
a deterioration of the “best strains of Old 
American stock” that ultimately culminated 
in the 1924 Immigration Act. 

Huxley subscribed to the so-called “mod- 
ern synthesis” (his term) of Darwinism and 
genetics. An active critic of the extreme rac- 
ism of conservative and Nazi eugenics, he 
was nonetheless a lifelong eugenicist. From 
his work with the United Nations Educa- 
tional, Scientific, and Cultural Organiza- 
tion (UNESCO), of which he was the first 
director, to his popular writings and inter- 
views on eugenics, demographic control, 
and transhumanism, Huxley’s enduring 
influence on the public’s ideas about evolu- 
tion in the 20th century is second to none. 

Despite efforts to remain objective, scientific narratives about human 
evolution are often laden with value judgments. 

Cavalli-Sforza belongs to a different sci- 
entific and cultural context. His pioneer- 
ing work in human population genetics 
focused on DNA sequences as the “funda- 
mental objects for the reconstruction of 
human history.” Working in an epoch less 
inclined to progressionist views, Cavalli- 
Sforza nonetheless presented his scientific 
project as endowed with a moral goal: us- 
ing genetic data to demonstrate a unified 
human history as a way to combat racism. 
His efforts to collect genetic material from 
indigenous populations, however, resulted 
in accusations of colonialism and imposi- 
tion of outsider knowledge. 

History Within: The Science, Culture, and Politics of Bones, 
Organisms, and Molecules 
Marianne Sommer 

The book centers on the idea of
knowledge in continuous transit through 
society, negotiated between different stake- 
holders. The three scientists here described 
were undoubtedly a key source of knowl- 
edge. However, if one thinks of how much 
Osborn imbued his science with the racism 
typical of the conservative elite of which he 
was part, if one considers how much Huxley 
adopted and rebranded language from the 
eugenics movement (2), and if one assesses 
some of the naive sociological and anthropo- 
logical assumptions behind Cavalli-Sforza’s 
work (3), it becomes clear how 
much these scientists were pro- 
foundly shaped by the wider social 
values and cognitive styles of their 
own times. 

Sommer’s book is very much 
about making sense of various lay- 
ers of biological data in the context 
of human history. But the book 
stops at the work of a population 
geneticist (Cavalli-Sforza), whose 
turn-of-the-century Human Ge-
nome Diversity Project aimed to
translate long-term historical pro- 
cesses such as human migrations 
into DNA sequences. 

In subsequent years, new postge- 
nomic approaches with even more 
ambitious goals have emerged. The 
epigenome, for instance, is no lon- 
ger thought of as just an archive of epochal 
events like mass migrations. Something 
more immediate is sought in it: a micro- 
history made of local and even tiny events, 
such as diet (famine, obesity), habit (smok- 
ing, alcohol), stress and trauma, and general 
lifestyle of our most direct ancestors. It is 
as yet too early to say whether disciplines 
like social and environmental epigenetics 
will fulfill their ambitious task, but their 
practitioners would do well to heed lessons 
learned from past attempts to read human 
history into biology. 
Clarifying samples
in Zika analyses


IN RESPONSE TO the Zika outbreak and 
putatively related microcephaly cases in 
Brazil, many research groups in Brazil, 
North America, and Europe are studying 
the virus (“Evidence grows for Zika virus as 
pregnancy danger,” G. Vogel, In Depth, 11 
March, p. 1123). The simultaneous efforts 
to address this urgent matter have led to 
the use of some samples in multiple stud- 
ies. There is no clear leadership or protocol 
to regulate access to patients, some of 
whom are also participants in studies of 
potential microcephaly causes unrelated to 
Zika, such as cytomegalovirus and genetic 
inbreeding (1). Some studies are report-
ing different results for the same set of
samples. For example, Calvet et al. (2) and
Oliveira Melo et al. (3) acknowledge over-
lap in the methods section. It is difficult
to assess studies if we cannot determine
whether, or to what degree, the cohorts are 
independent. Such confusion has already 
necessitated clarification of similar studies 
from different groups (4). 

To address this problem, the World 
Health Organization could create a uni- 
versal code for each baby with confirmed 
microcephaly and Zika infection, to be used 
by all research groups working with these 
data. The codes should include unique 
identifiers, generated by a competent gov- 
ernment agency, that indicate the country 
and institution of diagnosis as well as a 
serial number for each patient. The World 
Health Organization could also provide a 
public database for Zika cases that would 
include the Zika codes as well as epide- 
miological information. These steps would 
allow individual cases to be identified in 
multiple studies while protecting privacy. 
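
The coding scheme proposed above could be sketched as follows. This is only an illustration under stated assumptions: the field layout (country, institution, serial number) follows the letter, but the `CaseCode` class, the `BR-INST01-000002` string format, and the `shared_cases` helper are hypothetical names and are not part of any existing World Health Organization system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseCode:
    """Hypothetical universal code for one confirmed case."""
    country: str      # country of diagnosis (a two-letter code is assumed here)
    institution: str  # institution of diagnosis (assumed to contain no hyphen)
    serial: int       # serial number of the patient within that institution

    def __str__(self) -> str:
        # Assumed text form: COUNTRY-INSTITUTION-SERIAL, serial zero-padded
        return f"{self.country}-{self.institution}-{self.serial:06d}"

    @classmethod
    def parse(cls, text: str) -> "CaseCode":
        country, institution, serial = text.split("-")
        return cls(country, institution, int(serial))


def shared_cases(cohort_a, cohort_b):
    """Return the codes present in both cohorts, sorted for stable output."""
    return sorted(set(cohort_a) & set(cohort_b), key=str)


# Two made-up study cohorts that share one patient
study_1 = [CaseCode("BR", "INST01", 1), CaseCode("BR", "INST01", 2)]
study_2 = [CaseCode.parse("BR-INST01-000002"), CaseCode("BR", "INST02", 1)]

overlap = shared_cases(study_1, study_2)
print([str(code) for code in overlap])  # prints ['BR-INST01-000002']
```

Because the code carries no personal details, two groups could publish their code lists and let readers check how far the cohorts overlap without exposing patient identities.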

Joao R. M. de Oliveira* and 
Denis A. P. Moura 
Laboratório de Imunopatologia Keizo Asami, Federal
University of Pernambuco, Recife, Pernambuco,
50670-901, Brazil.


*Corresponding author. Email: joao.ricardo@ufpe.br 


REFERENCES 


1. G. França et al., The Lancet (2016); http://dx.doi.org/10.1016/S0140-6736(16)30902-3.
2. G. Calvet et al., Lancet Infect. Dis. 16, 653 (2016).
3. A. S. Oliveira Melo et al., Ultrasound Obstet. Gynecol. 47, 6 (2016).
4. M. de Fatima Vasco Aragao, BMJ 353, i2444 (2016).


10.1126/science.aah3733 


452 29 JULY 2016 » VOL 353 ISSUE 6298 


Researchers studying Zika and microcephaly are 
using overlapping sets of data. 


Animal-based 
antibodies: Obsolete 


THE GLOBAL ANTIBODY industry produces 
an indispensable resource for biological, 
molecular, and cell scientists. Antibodies 
are harvested from immunized animals. 
The animals suffer side effects from the
immunizations (1) and are, in some cases,
mistreated (2). It is no longer necessary
to compromise animal welfare: Since the
mid-1990s, animals have not been required 
for antibody production (3). It is long past 
time to replace the use of animal-generated 
antibodies with nonimmunized recombi- 
nant antibodies. 

Animal-Friendly Affinity reagents (AFAs) 
are antibodies that are generated by using 
recombinant technology in viruses or 
yeasts. The technology allows cloning of 
immunoglobulin gene segments, to produce 
antibody libraries with high diversity from 
which antibodies with desired specificities 
can be chosen. These are translated on the 
surfaces of cells or phage particles, and 
exposed to the target antigen, which selects 
a highly specific antibody, after which pro- 
duction can be scaled up within cell culture. 
AFAs are commercially available and can 
also be developed in individual laboratories. 
They have wide-ranging applicability as well
as specificity and affinity—equal to or greater
than those of their animal-generated counterparts—to
a huge repertoire of antigens. They also give
researchers greater control over antibody 
properties, generation time, and cost (4). 
Thanks to AFAs, the use of animals has 
become obsolete. 

EU Directive 2010/63/EU (5) requires the 
replacement of animals used in scientific 
procedures when alternatives exist. Yet, 
despite the maturation of a growing num- 
ber of techniques to produce AFAs, antibody 
production using animals continues to be 
authorized. Twenty years ago, EU Member 
States were advised that “in the near 
future,” antibody production “without prior 
immunization of [animals would] avoid the 
need to use living animals” (6). That predic- 
tion was correct. It is incomprehensible that 
such needless animal use continues. 

There is little clarity about how many 
animals are used to produce antibodies. 
Only two EU Member States have
published antibody production numbers. 
In 2013, the United Kingdom reported use 
of 9522 animals (7), and The Netherlands 
reported the use of 25,697 animals (8). The 
numbers do not include animals used to 
produce antibodies that were imported 
into those countries. Neither country has 
published the number of animals used for 
this purpose since 2013. 

We recommend the following actions: 
Antibody production methods that use ani- 
mal immunization should be replaced in 
EU Member States. Manufacturers outside 
the European Union should be required to 
adhere to European standards to qualify 
for import to Member States. An expert 
working group should be established to set 
up a roadmap for replacement. Programs 
should be implemented to ensure that 
animal-friendly antibody producers are 
fully supported. Subsequent reports 
from the Commission to the Council and 
the European Parliament on the statis- 
tics on the number of animals used for 
experimental and other scientific purposes 
should include data on the use of animals 
for antibody production as an independent 
category. These actions should be rein- 
forced through international cooperation 
and national agencies that can execute 
government regulation and prevent out- 
sourcing to regions where animal welfare 
is less well regulated. 
The line between
science and politics


M. ENSERINK’S NEWS Feature “Peace
of mind” (3 June, p. 1158) described
Mohammad Herzallah’s effort to establish
a research oasis in the West Bank. Such
a research center could benefit not only
the Palestinians in the region, but also the
development of science throughout the
Middle East. Enserink rightly highlighted
the physical difficulties of getting such an
oasis off the ground given barriers such as
checkpoints and security walls. However,
he crossed the line into politics when he
wrote, “Some hilltops are crowned with
the modern contours of Israeli settlements,
a major obstacle in the quest for peace.”
Whether you agree with this statement or
not, it does not belong in an article about
establishing a new scientific endeavor.
Mel Weintraub 
Skokie, IL 60076, USA. Email: melwein7@gmail.com 
10.1126/science.aah3551 


TECHNICAL COMMENT ABSTRACTS 


Comment on “Multiple repressive 
mechanisms in the hippocampus during 
memory formation” 

Rebecca S. Mathew, Hillary Mullan, Jan 
Krzysztof Blusztajn, Maria K. Lehtinen 
Cho et al. (Reports, 2 October 2015, p. 82) 
report that gene repression after contextual 
fear conditioning regulates hippocampal 
memory formation. We observe low levels 
of expression for many of the top candidate 
genes in the hippocampus and robust 
expression in the choroid plexus, as well as 
repression at 4 hours after contextual fear 
conditioning, suggesting the inclusion of 
choroid plexus messenger RNAs in Cho et 
al. hippocampal samples. 

Full text at http://dx.doi.org/10.1126/science.aaf1288


Response to Comment on 
“Multiple repressive mechanisms 
in the hippocampus during 
memory formation” 

Jun Cho, Nam-Kyung Yu, V. Narry 
Kim, Bong-Kiun Kaang 

Mathew et al. propose that many 
candidate genes identified in our 
study may reflect the events in the 
choroid plexus (ChP) potentially 
included in hippocampal samples. 
We reanalyze our data and find that 
the ChP inclusion is unlikely to affect 
our major conclusions regarding the 
basal suppression of translational 
machinery or the early translational 
repression (at 5 to 10 minutes). As 
Mathew et al. examined for a subset 
of genes at 4 hours, we agree that the 
late suppression may partly reflect 
the events in the ChP. Although the 
precise contribution of anatomical 
sources remains to be clarified, our 
behavioral analyses indicate that 
the late-phase suppression of these 
genes may contribute to memory 
formation. 

Full text at http://dx.doi.org/10.1126/science.aaf2081
Response to Comment on
“Multiple repressive mechanisms
in the hippocampus during
memory formation”

Jun Cho,1,2* Nam-Kyung Yu,2* V. Narry Kim,1,2† Bong-Kiun Kaang2†




We have previously identified numerous
regulatory mechanisms during long-term
memory formation by analyzing tran-
scriptome and translatome in mouse hip-
pocampus. In particular, we found three
types of repressive events: basal translational sup-
pression of ribosomal biogenesis, learning-induced
early translational repression of genes such as
Nrsn1 (at 5 to 10 min after learning), and late
transcriptional suppression of a subset of genes
in the estrogen receptor signaling pathway (at
30 min to 4 hours after learning).

Mathew et al. (1) performed reverse transcription 
polymerase chain reaction (RT-PCR) on the sev- 
eral candidate genes at the 4-hour time point 
and proposed that choroid plexus (ChP) might 
have been included in our hippocampal sam- 
ples and that the ChP inclusion might affect the 
changes detected at 4 hours (2). It is important to 
note that an immediate and speedy isolation of 
the hippocampi at specific time points was cru- 
cial for us to minimize perturbation of gene net- 
works in the tissue because RNA in neuronal 
tissues is known to degrade rapidly after death. 
Our protocol enabled us to perform an accurate 
time course study with short time intervals using 
ribosome profiling (RPF) and RNA sequencing 
(RNA-seq). As Mathew et al. and another recent 
study (3) indicate, this conventional method of 
simple and quick isolation inevitably involves the 
inclusion of the ChP that is closely attached to the 
hippocampus (3). 


1Center for RNA Research, Institute for Basic Science, Seoul
08826, Korea. 2Department of Biological Sciences, College of
Natural Sciences, Seoul National University, Seoul 08826,
Korea.

*These authors contributed equally to this work. †Corresponding
author. Email: narry@snu.ac.kr (V.N.K.), kaang@snu.ac.kr (B.-K.K.)
To evaluate the effect of ChP inclusion on
our transcriptome (RNA-seq) and translatome 
(RPF) data, we sought to examine the expression 
patterns of the genes that are enriched in the 
ChP and depleted in the hippocampus. To our 
knowledge, the transcriptome of the ChP has not 
been profiled in direct comparison with that of 
the hippocampus, so it is actually impossible to 
properly define the “ChP signature genes.” There- 
fore, first, we examined the genes reported to be 
expressed relatively highly in ChP cells—compared 
with neurons, oligodendrocytes, and astrocytes— 
by in situ hybridization experiments (4) (catego- 
rized here as “ChP cell signature,” 109 genes). We 
determined their relative abundance ranks among 
the genes detected in our RNA-seq data from hip- 
pocampal tissue and primary culture, and those in 
the published ChP transcriptome data, indepen- 
dently (5) (deposited in Gene Expression Omnibus, 
GSE66312). These genes showed strong enrich- 
ment in the ChP transcriptome, whereas they 
were expressed at relatively low levels in hippo- 
campal primary culture (Fig. 1, A and B). 

In our previous study, we observed a low level 
of translation efficiency of the ribosomal protein- 
coding genes from hippocampal primary culture 
as well as hippocampal tissue (2). Considering 
that the ribosomal protein-coding genes are re-
pressed translationally in the ChP cell-depleted
culture, the basal regulation observed in our find- 
ings is unlikely to be affected by ChP inclusion. 

Although the ChP cell signature genes were de- 
tected at very low levels in the hippocampal pri- 
mary culture, they showed a relatively moderate 
level of RNA expression in hippocampal tissue 
(Fig. 1, A and B). These results indicate that, as 
Mathew et al. and another study pointed out 
(3), there might be a chance of inadvertent ChP
inclusion in hippocampal tissue preparation. Even
though the relative proportion of the adjacent 
ChP is tiny compared to that of the hippocampus 
as the source for the libraries, several genes that 
are highly expressed in the ChP may influence 
the transcriptome and translatome data. 

To further evaluate the contribution of the 
transcripts abundant in the ChP, we generated 
another list of genes that are expressed highly in 
the ChP transcriptome (top 15%) and expressed 
at low levels in the primary culture transcrip- 
tome (below 50%) [dubbed as “ChP enriched 
(vs Hippo PC)”, 369 genes]. We then examined 
how many differentially expressed genes (DEGs) 
that we identified at different time points belong 
to the two ChP-specific signature sets. Only one 
gene (Dnah6) of the “ChP enriched (vs Hippo 
PC)” and none of the “ChP cell signature” genes 
was included among the DEGs at early phases 
(among 25 DEGs at 5 min and 16 DEGs at 10 min) 
(Fig. 1C). Notably, Nrsn1, which we investigated
as an example of the early repressed genes, is not
highly expressed in ChP. We previously showed
that Nrsn1 was down-regulated by neuronal ac-
tivity in hippocampal primary culture and that
the mice with Nrsn1 overexpression in hippo-
campal neurons exhibited deficits in long-term
memory formation (2). Therefore, most of the 
significant changes detected at early phases are 
likely to be solely attributed to the hippocampus. 
Furthermore, behavioral results revealed the func- 
tional significance of an early repressed gene in 
hippocampal neurons. 

Unlike the early events, the DEGs at 30 min 
and at 4 hours after learning included many 
genes that belong to the two ChP-specific signa-
ture sets. In particular, a large proportion of the
DEGs that showed transcriptional repression 
were members of either one or both of the sets 
(24 out of 42 DEGs at 30 min and 37 genes out 
of 55 DEGs at 4 hours) (Fig. 1C), whereas none 
of the activated DEGs are enriched in the ChP. 
We therefore do not exclude the possibility that 
the late persistent down-regulation may reflect 
at least partly the changes in the ChP rather than 
those in the hippocampus. 
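
The selection and overlap steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the thresholds (top 15% in the ChP transcriptome, below 50% in the primary culture transcriptome) come from the text, while the percentile-rank definition, the function names, and the toy expression values are assumptions made for the example.

```python
def percentile_ranks(expr):
    """Map each gene to its rank fraction (1.0 = highest expression)."""
    ordered = sorted(expr, key=expr.get)  # genes from lowest to highest value
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}


def chp_enriched(chp_expr, culture_expr, top=0.15, low=0.50):
    """Genes in the top `top` fraction in ChP and the bottom `low` fraction in culture."""
    chp_rank = percentile_ranks(chp_expr)
    culture_rank = percentile_ranks(culture_expr)
    shared = chp_rank.keys() & culture_rank.keys()
    return {g for g in shared
            if chp_rank[g] > 1 - top and culture_rank[g] <= low}


def overlap_fraction(degs, signature):
    """Count DEGs that fall in a signature set and give their share of all DEGs."""
    hits = set(degs) & set(signature)
    return len(hits), len(hits) / len(degs)


# Toy data (made up): ten genes, g9 and g8 are the most abundant in ChP,
# but only g9 is also weakly expressed in the primary culture.
chp = {f"g{i}": float(i) for i in range(10)}
culture = {f"g{i}": float(i + 1) for i in range(8)}
culture.update({"g8": 100.0, "g9": 0.5})

signature = chp_enriched(chp, culture)
print(signature)                                            # prints {'g9'}
print(overlap_fraction(["g9", "g1", "g2", "g3"], signature))  # prints (1, 0.25)
```

Applied to the counts reported in the text, the same ratio gives 24 of 42 (about 57%) of the repressed DEGs at 30 min and 37 of 55 (about 67%) at 4 hours.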

However, Mathew et al. show that seven out 
of the nine ChP abundant genes they tested by 
qRT-PCR were down-regulated in the hippocam- 
pus as well as in the ChP. Four of them reached 
statistical significance, which is the same num- 
ber of significantly decreased genes in the ChP. 
Among the down-regulated DEGs at 4 hours, 
many genes were also previously identified as 
repressed genes in the hippocampus after learn- 
ing (6). These reports, including ours, consistently 
show the gene repression after conditioning in 
the hippocampal transcriptomes. To conclusively 
clarify whether the learning-induced gene regu- 
latory events occur in the hippocampus or the 
ChP or both, we must await subsequent studies 
using advanced methods. 

Fig. 1. Analysis of transcriptional and translational alteration of ChP-specific genes after contextual fear conditioning. (A) Box plot showing the
normalized sequencing tag counts of all genes and ChP cell signature genes (4) (109 genes) in the RNA-seq libraries from hippocampal tissues and primary
cultures (GSE72064) and ChP (GSE66312). (B) Cumulative distribution of the normalized tag counts of all genes and ChP cell signatures [alternative rep-
resentation of (A)]. (C) Scatter plots showing relative changes of whole genes (small gray dots) and DEGs (large dots edged with thick lines). The different
colors indicate whether a DEG belongs to neither of the two ChP-specific sets (cyan), to either set (yellow or pink), or to both sets (red).

Last, one of the fundamental purposes of our
recent study was to find and suggest unknown
mechanisms that are important for long-term
memory formation. Our behavioral analysis upon
the activation of ESR1, a potent upstream regulator
for a large portion (about half) of the repressed
DEGs at late phase, underpins the importance of 
the late persistent down-regulation of these genes 
for long-term memory formation: We have vali- 
dated that a large number of DEGs at 4 hours are 
under the control of the ESR1 signaling pathway 
and that ESR1 agonist injection into the hippo- 
campus impaired hippocampus-dependent mem- 
ory formation in an object location task with 
nonstressful training, as well as contextual fear 
conditioning (2). ESR1 agonist may have directly 
acted on the hippocampus, but we cannot ex- 
clude the possibility that the agonist affected the 
ChP cells by diffusion via cerebrospinal fluid or 
blood vessels. However, regardless of the routes 
of ESR1 agonist action, these results indicate the 
functional importance of ESR1 signaling-mediated 
repression in memory formation. The possible in- 
volvement of the ChP leads to an intriguing hy- 
pothesis that the altered transcriptomic changes 
in the ChP may affect the composition of the 
cerebrospinal fluid, eventually regulating the hip-
pocampal plasticity during memory formation,
which remains to be investigated. 

To summarize, a small amount of ChP could 
have been included in our hippocampal samples, 
because it is technically very difficult to perfectly 
separate the two regions in a short time by using 
the widely used dissection method. Given that 
much research on hippocampal transcriptome to 
date has shared the common isolation method 
we used (2), we agree that Mathew et al. raised 
an important issue to be addressed in future 
studies. However, our data set is a rich source of 
valuable information on transcriptomic and trans- 
latomic changes after learning. Most of the gene 
expression changes reflect the response from the 
hippocampus, which constitutes most of the sam- 
ple volume unless the genes are highly specific to 
the ChP. The potential inclusion of the ChP does 
not affect our major conclusions regarding the 
basal translational suppression of translation
machineries or the early translational repres-
sion of specific genes after learning. For some
of the down-regulated genes at the late phase 
after contextual fear conditioning, we should 
consider the possibility that their changes in 
the transcriptomes may have been affected by 
ChP inclusion. Further analysis may be required 
to determine whether these regulations occur in 
the hippocampus, ChP, or both, and which event 
is critical for long-term memory formation. 
Some species of mussels are affected by rising temperatures and acidity
levels in the ocean, two symptoms of climate change.


AAAS Pacific Division explores climate-change communication 


Annual meeting symposium considers multidisciplinary approaches 


By Andrea Korte 


Organizations that have studied climate-change education efforts— 
including AAAS, the National Science Foundation, the National 
Oceanic and Atmospheric Administration, NASA, and others— 
have found that simply providing scientific data is not spurring 
the necessary public and political responses. To connect with 
audiences about the science of climate change and its wide- 
ranging human impacts, new approaches that draw from across 
the sciences—and the humanities—are necessary, said experts at 
the AAAS Pacific Division annual meeting. 

“The world around us is seeing the impacts of climate change, 
and it’s important for us to explain what’s going on,” said Michel 
Boudrias, department chair of environ- 
mental and ocean sciences at the Uni- 
versity of San Diego. Boudrias advocates 
an approach to communicating about 
climate change that can “bring in other 
disciplines and other dimensions to get 
the message across,” he said. 

The science of climate change—and its 
effects on western North America—was a 
recurring subject at the Pacific Division meeting, held 14-17 June 
at the University of San Diego. In several symposia and lectures, 
scientists presented new findings, including about the effects of 
rising ocean temperatures on sea creatures in the Pacific Ocean 
and the presence of aerosol particles in San Diego’s atmosphere. 
The 2016 meeting, which marked 100 years since the Pacific 
Division’s first official gathering, drew hundreds of researchers, 
educators, and students from a diverse range of fields. 

AAAS has long been a home for interdisciplinary collabora- 
tion, said Rush Holt, AAAS CEO and executive publisher of the 
Science family of journals, at a 15 June town hall meeting for 
the Pacific Division. AAAS sees “all disciplines represented,” and 
it supports scientific work that may not fall within traditional 
boundaries, Holt said. The association’s interdisciplinary empha- 
sis dates to its founding in 1848, Holt said, when its organizers 
disbanded their scientific societies focused on individual disci- 
plines to form AAAS to advance science as a whole. 

Robert Louis Chianese—professor emeritus of English at 
California State University, Northridge, and, with Boudrias, an 
organizer and presenter of the symposium on innovative methods 
for communicating climate change—recommended grounding 
climate-change communication in the humanities and center- 
ing it on narrative. “Make it personal,” said Chianese, who has 
explored the role that storytelling, documentary, and personal 
confessions can have in spurring action. 
Chianese recounted his encounter with a 
fishing-boat captain turned environmen- 
talist who gave up fishing after witnessing 
too much waste—and bringing in fewer 
and fewer full-size creatures. “It moved 
me because of the teller’s transforma-
tion,” Chianese said of the captain’s tale.
“Such stories have a way of drawing us 
in, tying us to the specifics of the character and the situation in a 
way that media coverage of abuses against nature often does not. 
Hearing stories of personal confessions of transgressions against 
nature can prompt others to reevaluate their relationship to the 
natural world.” 

The captain’s tale echoes an example from literature of the 
power of confessional storytelling, Chianese said: Samuel 
Taylor Coleridge’s 18th-century poem, “The Rime of the Ancient 
Mariner.” After a thoughtless sailor kills an albatross, his guilt 
prompts him to tell his tale and impart environmental wisdom, 
Chianese said. 
AAAS Pacific Division Executive Director Roger Christianson (left) and Rush Holt


Tom Fehrenbacher, a teacher at a San Diego high school, also 
sees a role for the humanities in conveying the science of climate 
change. “Science is lame without the humanities, and humanities 
without the sciences are blind,” he said, recasting a quote from 
Albert Einstein on the connection between science and religion. 
“Humanities bring value to the discussion,” Fehrenbacher added, 
“but we need to have our values informed by fact.”

Gathering and interpreting the facts also requires broad 
collaboration, said Nilmini Silva-Send, assistant director of the 
Energy Policy Initiatives Center at the University of San Diego. 
Data gathered from different disciplines is crucial for crafting 
local, state, and federal policies that address climate change 
and its effects, she said. Assessing greenhouse gas emissions, for 
instance, requires information from experts in energy, policy, 
engineering, planning, and law. 

“We need to talk across all disciplines,” she said. 

Boudrias detailed the multidisciplinary work of the Climate 
Education Partners, which brings together a team of climate and 
environmental scientists, educators, social psychologists focused 
on the study of learning, and experts in communications, policy, 
and the law. The group, of which Boudrias is a principal investi-
gator, developed a report to help San Diego-area leaders make
informed decisions in response to climate change and its effects. 

“The science is there,” said Boudrias of the report, called San 
Diego, 2050 Is Calling. How Will We Answer? Scientific evidence 
in the report is bolstered by input from other fields to help 
experts best connect with their audience of local leaders in gov- 
ernment, business, the transportation sector, the public health 
field, the region’s American Indian tribes, and its robust Latino 
community. 

The report is grounded in social psychology concepts that 
empower local decision-makers to take action, Boudrias said. 
The report does more than just convey what scientists expect for 
San Diego’s climate—it provides concrete steps for local leaders 
to “answer the call” and address, according to a community’s 
needs, climate-change effects like dwindling water resources, 
longer wildfire seasons, and coastal flooding. It also uses 
infographics to depict climate-change effects and deliberately 
chosen language, such as presenting temperature increases in
Fahrenheit, rather than in scientist-approved Celsius.

The report leverages people’s concern for future genera- 
tions—their top priority, according to the group’s research. This 
perspective also helped to inform the timeline of the report, 
which looks ahead to 2050, rather than hewing to the 100-year 
models that many climate scientists study. Looking toward the 
near-term future taps into people’s concern for the next gen- 
eration, thereby providing a “tighter connection emotionally,” 
Boudrias said. 

According to Tiffany Lohwater of AAAS, the session at the 
Pacific Division meeting presented useful case studies in local 
climate-change communication. “To effectively connect with 
your audience, research demonstrates that it’s important to 
emphasize how climate change affects people here and now, and 
right where they live,” said Lohwater. As deputy chief commu-
nications officer and director of AAAS meetings and public en-
gagement, Lohwater heads up the association’s Communicating
Science workshops and the Alan I. Leshner Leadership Institute 
for Public Engagement with Science, which last year named 
15 fellows to promote science-society dialogue about climate 
change and to be role models for their peers. “It’s encouraging 
to see AAAS division leaders working to convey the science of 
climate change in their own communities.” 


Science Immunology, the latest addition
to the Science family of journals, unveiled
its inaugural content on 14 July, featuring
research that overturned previous think-
ing about transplant rejection mechanisms,
insights into how different animal species
have evolved to recognize and defend against
an array of elements toxic to the immune sys-
tem, and much more. A second issue featured
research that might help bring scientists
closer to developing novel therapies for dis-
eases like type 1 diabetes, rheumatoid arthri-
tis, and multiple sclerosis. Through the end of
2016, the subscription-based journal, which
features interdisciplinary research in immu-
nology, drawing from studies in all organisms
and model systems, including humans, will be made
freely available. Through a partnership with the Feder-
ation of Clinical Immunology Societies (FOCIS), lead-
ers of that organization will serve on the new journal’s
editorial advisory board, and FOCIS members will have
continued access to the journal. “Innate and adap-
tive immune cells contribute to diverse inflammatory
disorders—fibrosis, cardiovascular and metabolic
diseases, and even neurodegenerative diseases such
as Alzheimer’s and Parkinson’s,” Science Immunology
Editor Angela C. Colmone wrote in a first-ever edito-
rial, with Chief Scientific Advisors Abul K. Abbas, M.D.,
and Federica Sallusto. Showcasing immunology stud-
ies across disciplines and technologies “will encour-
age collaborative and innovative research,” they wrote.
Another new journal, Science Robotics, will debut later
this year. See www.scienceimmunology.org.
Innovation competition empowers young entrepreneurs 


Science, technology, and local concerns fuel ideas from the developing world 


By Juan David Romero 


Clarisse Uwineza was only 8 years old when both of her parents were 
killed in the 1994 Rwandan genocide. She was the second-oldest 
child, and it fell on her to look after four younger siblings. Despite 
her heavy burden and all that she had suffered, she managed to 
attend school. In a country where, at least at the time, girls often
left school before earning a diploma, Uwineza continued to study.

Today, at age 28, Uwineza has her Bachelor of Science degree in en- 
vironmental chemistry and is the founder/CEO of her own company, 
Environmental Protection and Organics. One of her projects, which 
is aimed at converting bio-waste into clean organic fertilizer, took 
her as far as Stanford University in late June, where she was one of 
29 entrepreneurs from the developing world who showcased start-up 
companies at the sixth Global Innovation through Science and 
Technology (GIST) Tech-I Competition. 
There, she mingled with fellow scientists, 
entrepreneurs, and potential investors 
and received coaching and mentorship, 
while vying for important networking 
relationships and seed capital. 

Uwineza said that participating in 
GIST gave her an amazing opportunity. 

“It was my first time in the United 
States, my first time pitching my project 
in front of people, my first time meeting 
hundreds of entrepreneurs and mentors, 
my first time doing what most people 
call networking, my first time in a com- 
petition like this—and I say, wow, thank you, God,” Uwineza said. 

The GIST initiative was launched in 2011 by the U.S. Department 
of State to provide mentorship and networks to aspiring entrepre- 
neurs from developing nations, to equip them with the tools to 
impact their communities. This year’s GIST Tech-I Competition was 
the third AAAS has administered, coordinating the selection and 
organization of participants and providing experienced mentors to 
work with the young entrepreneurs. 

“At AAAS, we believe the power of curiosity and creativity can 
benefit communities and improve lives everywhere,” said Rush Holt, 
AAAS CEO and executive publisher of the Science family of journals, 
at the competition’s award ceremony, “and the Tech-I program is a 
representation of that.” 

Lisa Brodey, the State Department’s executive director for GIST, 
explained the importance of the program in furthering innovation 
and delivering economic benefits. 

“When young innovators have the skills and mentoring that they 
need, they are more likely to take the risks that can turn ideas into 
startups and ultimately into successful businesses,” she said. 

From year to year, the program has seen increased participation 
among women. Among this year’s 29 finalists, 12 were female. 

The finalists this year were selected from more than 1,000 appli- 
cants from 104 emerging economies. 

“They’re all in different stages,” said Kellye Eversole, a GIST men- 
tor and president of a technology consulting firm. “Some of them 
have not really formed their business plans yet, but the education 
they got [during the 22-24 June competition in Silicon Valley] is 
going to help them do that. Others are ready to launch, and what 
they need is some short-term capital investment to try to do the final 
stages of development before they go to market.” 


Clarisse Uwineza 

Workshops focus on female entrepreneurs 


Focusing on female innovators’ access to entrepreneurship, a series of 
Women's Village Workshops held this year in Côte d'Ivoire, Mozambique, 
and Nigeria taught strategies for expanding business networks and pro- 
pelling science and technology innovations toward the market. 

Each workshop, organized and managed by AAAS's Research Compet- 
itiveness Program (RCP), brought together 25 local entrepreneurs from 
an array of fields, including information technology, health, telecommuni- 
cations, and agriculture. A majority of the participants were women. 

The program is just one element of the U.S. Department of State's 
Global Innovation through Science and Technology (GIST) initiative, 
which helps young innovators from around the world with startup com- 
panies that tackle economic and development challenges. 

To accelerate the workshop participants’ entrepreneurship, each 
was challenged to reach “60 in 6”: to expand her network by 60 people 
over the course of the next 6 months using the strategies learned at the 
workshop. 

“After the workshop, I put into practice what I have learned, and my 
network continues to grow,” said Jessyca Esther Houenou, a software 
and web developer who has cofounded an organization to promote Côte 
d'Ivoire's natural wealth. Houenou also said that she has shared digital 
marketing techniques from the workshop with other women who were 
interested in reaching new customers. 

Cultivating leadership among participants is a key benefit of the work- 
shops, said Charles Dunlap, director of RCP. 

“We see all of those things as capacity-building—we want to see our 
participants pass on knowledge,” he said. 

Participant Safoura Fadiga, also of Côte d'Ivoire, said that the work- 
shop she attended was structured to promote collaboration. 

“I liked the communication techniques used,” she said, explaining that 
participants moved around the room to interact with one another, rather 
than remaining seated. An engineer, teacher, and entrepreneur, Fadiga 
leads IST-DUBASS, a private, French-English bilingual college that aims 
to train more young people—particularly girls—in science, technology, 
engineering, and mathematics. —Andrea Korte 


The contestants were selected based on scores from an expert 
review panel convened by AAAS, who reviewed applicant materials 
including promotional pitch videos. Top scorers went on to a public 
vote. After being selected, the finalists received support to attend the 
2016 Global Entrepreneurship Summit (GES) at Stanford, along with 
the two-day entrepreneurship workshop with successful entrepre- 
neurs, scientists, and investors, which preceded a pitch competition 
offering $70,000 in prize money. 

Charles Dunlap is the program director for the AAAS Research 
Competitiveness Program (RCP), which leads the GIST Tech-I project 
for AAAS. He said that the spirit and goals of entrepreneurship 
should be promoted because they tie into the State Department’s 
diplomatic goals, “and for AAAS, they tie into our goals to see science 
have full impact in society and economic development.” 

For Uwineza, her commitment to helping improve people’s lives is 
stronger than ever. 

“I will do everything until I have great positive impact in society, 
not only in my country, Rwanda, but also in Africa and even the 
entire world,” she said. 

K. S. Novoselov,* A. Mishchenko, A. Carvalho, A. H. Castro Neto* 


BACKGROUND: Materials by design is an ap- 
pealing idea that is very hard to realize in 
practice. Combining the best of different in- 
gredients in one ultimate material is a task 
for which we currently have no general solu- 
tion. However, we do have some successful 
examples to draw upon: Composite materials 
and III-V heterostructures have revolution- 
ized many aspects of our lives. Still, we need 
a general strategy to solve the problem of mix- 
ing and matching crystals with different prop- 
erties, creating combinations with predetermined 
attributes and functionalities. 


ADVANCES: Two-dimensional (2D) materials 
offer a platform that allows creation of hetero- 
structures with a variety of properties. One- 
atom-thick crystals now comprise a large family 
of these materials, collectively covering a very 
broad range of properties. The first material 
to be included was graphene, a zero-overlap 
semimetal. The family of 2D crystals has grown 
to include metals (e.g., NbSe2), semiconduc- 
tors (e.g., MoS2), and insulators [e.g., hexagonal 
boron nitride (hBN)]. Many of these materials 
are stable at ambient conditions, and we have 
come up with strategies for handling those that 
are not. Surprisingly, the properties of such 2D 
materials are often very different from those 
of their 3D counterparts. Furthermore, even 
the study of familiar phenomena (like super- 
conductivity or ferromagnetism) in the 2D case, 
where there is no long-range order, raises many 
thought-provoking questions. 

A plethora of opportunities appear when we 
start to combine several 2D crystals in one 
vertical stack. Held together by van der Waals 
forces (the same forces that hold layered ma- 
terials together), such heterostructures allow a 
far greater number of combinations than any 
traditional growth method. As the family of 
2D crystals is expanding day by day, so too is 
the complexity of the heterostructures that 
could be created with atomic precision. 

Production of van der Waals heterostructures. Owing to a large number of 2D crystals available 
today, many functional van der Waals heterostructures can be created. What started with 
mechanically assembled stacks (top) has now evolved to large-scale growth by CVD or physical 
epitaxy (bottom). 

When stacking different crystals together, 
the synergetic effects become very impor- 
tant. In the first-order approximation, charge 
redistribution might occur between the neigh- 
boring (and even more distant) crystals in the 
stack. Neighboring crystals can also induce struc- 
tural changes in each other. 
Furthermore, such changes can be controlled 
by adjusting the relative orientation between 
the individual elements. 

Read the full article at http://dx.doi.org/10.1126/science.aac9439 

Such heterostructures have already led to 
the observation of numerous exciting physical 
phenomena. Thus, spectrum reconstruction in 
graphene interacting with hBN allowed sev- 
eral groups to study the Hofstadter butterfly 
effect and topological currents in such a sys- 
tem. The possibility of positioning crystals in 
very close (but controlled) proximity to one 
another allows for the study of tunneling and 
drag effects. The use of semiconducting mono- 
layers leads to the creation of optically active 
heterostructures. 

The extended range of functionalities of such 
heterostructures yields a range of possible 
applications. Now the highest-mobility gra- 
phene transistors are achieved by encapsulating 
graphene with hBN. Photovoltaic and light- 
emitting devices have been demonstrated by 
combining optically active semiconducting 
layers and graphene as transparent electrodes. 


OUTLOOK: Currently, most 2D heterostruc- 
tures are composed by direct stacking of in- 
dividual monolayer flakes of different materials. 
Although this method allows ultimate flexibil- 
ity, it is slow and cumbersome. Thus, techni- 
ques involving transfer of large-area crystals 
grown by chemical vapor deposition (CVD), 
direct growth of heterostructures by CVD or 
physical epitaxy, or one-step growth in solution 
are being developed. Currently, we are at the 
same level as we were with graphene 10 years 
ago: plenty of interesting science and un- 
clear prospects for mass production. Given the 
fast progress of graphene technology over 
the past few years, we can expect similar ad- 
vances in the production of the heterostruc- 
tures, making the science and applications 
more achievable. 


The list of author affiliations is available in the full article online. 

The physics of two-dimensional (2D) materials and heterostructures based on such 
crystals has been developing extremely fast. With these new materials, truly 2D physics 
has begun to appear (for instance, the absence of long-range order, 2D excitons, 
commensurate-incommensurate transition, etc.). Novel heterostructure devices—such as 
tunneling transistors, resonant tunneling diodes, and light-emitting diodes—are also 
starting to emerge. Composed from individual 2D crystals, such devices use the properties 
of those materials to create functionalities that are not accessible in other heterostructures. 
Here we review the properties of novel 2D crystals and examine how their properties are 
used in new heterostructure devices. 


The family of two-dimensional (2D) mate- 
rials (1) has grown appreciably since the 
first isolation of graphene (2). The emergence 
of each new material brings excitement and 
puzzles, as the properties of these materials 
are usually very different from those of their 3D 
counterparts. Furthermore, 2D materials offer 
great flexibility in terms of tuning their electronic 
properties. Thus, band-gap engineering can be 
carried out by changing the number of layers in 
a given material (3, 4). Even more interesting is 
the specific 2D physics observed in such materials 
[for instance, Kosterlitz-Thouless (KT) behavior, 
characterized by the emergence of topological 
order, resulting from the pairing of vortices and 
antivortices below a critical temperature]. Crystals 
with transition metals in their chemical com- 
position are particularly prone to many-body in- 
stabilities such as superconductivity, charge 
density waves (CDWs), and spin density waves 
(SDWs). Such effects can also be induced by prox- 
imity if such crystals are sandwiched with other 
2D materials. 

Not only do heterostructures of 2D materials 
offer a way to study these phenomena, they also 
present unprecedented possibilities of combining 
them for technological use. Such stacks are very 
different from the traditional 3D semiconductor 
heterostructures, as each layer acts simultaneously 
as the bulk material and the interface, reducing 
the amount of charge displacement within each 
layer. Still, the charge transfers between the layers 
can be very large, inducing large electric fields and 
offering interesting possibilities in band-structure 
engineering. 

Singapore 117542. 

Among the tools for band-structure engineer- 
ing in van der Waals heterostructures are the 
relative alignment between the neighboring crys- 
tals, surface reconstruction, charge transfer, and 
proximity effects (when one material can borrow 
the property of another by contact via quantum 
tunneling or by Coulomb interactions). Thus, a 
moiré structure for graphene on hexagonal boron 
nitride (hBN) leads to the formation of secondary 
Dirac points (5-9), commensurate-incommensurate 
transition in the same system leads to surface 
reconstruction (10) and gap opening in the elec- 
tron spectrum (8), and spin-orbit interaction can 
be enhanced in graphene by neighboring tran- 
sition metal dichalcogenides (TMDCs) (11, 12). 

Here we provide a review of 2D materials, ana- 
lyzing the physics that can be observed in such 
crystals. We discuss how these properties are put 
to use in new heterostructure devices. 


Transition metal dichalcogenides 


Transition metal dichalcogenides, with the formula 
MX2 (where M is a transition metal and X is a 
chalcogen), offer a broad range of electronic prop- 
erties, from insulating or semiconducting (e.g., Ti, 
Hf, Zr, Mo, and W dichalcogenides) to metallic 
or semimetallic (V, Nb, and Ta dichalcogenides). 
The different electronic behavior arises from the 
progressive filling of the nonbonding d bands by 
the transition metal electrons. The evolution of 
the electronic density of states (DOS) is shown in 
Fig. 1 [adapted from (13-17)] for the most stable 
phase of each of the dichalcogenides. 

All TMDCs have a hexagonal structure, with 
each monolayer comprising three stacked layers 
(X-M-X). The two most common polytypes of the 
monolayers are trigonal prismatic (e.g., MoS2 
and WS2) and octahedral (e.g., TiS2); these terms 
refer to the coordination of the transition metal 
atom. Inversion symmetry is broken in the former, 
giving rise to piezoelectricity and having impor- 
tant consequences for the electronic structure. In 
addition, many of the tellurides, TcS2, ReS2, and 
other dichalcogenides adopt lower-symmetry struc- 
tures in which the metal atom is displaced away 
from the center of the coordination unit. 


Metallic TMDCs 


As shown in Fig. 1, the DOS of metallic TMDCs 
has two main properties: (i) The Fermi level of 
the undoped material is always crossing a band 
with d-orbital character, implying that the electrons 
move mostly in the metal layers, and (ii) the DOS at 
the Fermi level is usually quite high, which hints at 
a common explanation for the phase transitions 
observed in these materials (18). 

The interest in these materials comes from the 
existence of CDWs and superconductivity in their 
phase diagrams (19). Whereas the CDW phase has 
clear insulating tendency (opening a gap and sup- 
pressing the DOS at the Fermi level), the su- 
perconducting phase needs finite DOS to exist, 
resulting in a direct competition between the 
two many-body states. This competition leads to 
a complex phase diagram with the presence of 
inhomogeneous electronic and structural patterns, 
which have been observed in electron microscopy 
and neutron scattering in the 3D parent compound. 
Measurements of specific heat and magnetic sus- 
ceptibility in 3D samples have shown partial gap- 
ping of the Fermi surface. In some cases (e.g., TaS2), 
the CDW transition leads to the decoupling of 
the unit cells along the axis perpendicular to the 
planes, with an enormous increase in transverse 
resistivity. 

These unusual properties of metal TMDCs have 
been the subject of intense theoretical debate, but 
no consensus has been reached. The mechanism 
for the CDW transition does not fit standard weak- 
coupling mean field theories such as Fermi surface 
nesting or transitions induced by van Hove sin- 
gularities. Many angle-resolved photoemission 
experiments have been performed in 3D samples 
with contradictory results (20). The existence of 
several Fermi surface sheets and the partial gap- 
ping of the Fermi surface make the theoretical 
interpretation of the experimental data quite dif- 
ficult. Furthermore, the coexistence of CDWs and 
superconductivity (clearly seen in local probes) 
(21) indicates that many-body effects play a very 
important role in these materials. 

Critical information can be obtained from trans- 
port data in these materials, when transport mea- 
surements are performed in conjunction with the 
application of electric and magnetic fields. Ex- 
ternal electric field changes the Fermi energy and 
the carrier concentration in the 2D material, with- 
out the need for chemical doping (which was the 
case in 3D materials and which introduces ap- 
preciable disorder). 

In a recent experiment on 1T-TiSe2, a 2D film 
was encapsulated by hBN and subjected to trans- 
verse electric and magnetic fields (22). By apply- 
ing an external electric field to change the carrier 
density, it was possible to tune the CDW tran- 
sition temperature from 170 to 40 K and, concom- 
itantly, the superconducting transition temperature 
from 0 to 3 K. Controlling the transition temper- 
atures using an electric field allows the critical 
exponents for the phase transition to be determined 
with high accuracy. Moreover, applying an ex- 
ternal transverse magnetic field at the same time 
reveals novel physical phenomena associated 
with periodic motion of the Cooper pairs in the 
superconducting phase. Such behavior seems to 
be tied to the formation of discommensu- 
rations between different CDW domains; namely, 
the electronic system breaks down into perfectly or- 
dered superconducting and CDW domains. 


Phase transitions in 2D materials 


Electrons in a solid are characterized by several 
quantum numbers that include charge and spin. 
Due to electron-electron or electron-ion inter- 
actions, electrons can organize themselves in 
phases characterized by an order parameter that 
is associated with these degrees of freedom. In a 
CDW state, as in the case of TMDCs, the order 
parameter is the local electron density ρ(r), where 
r is the position vector, which orders with a well- 
defined periodicity. This periodicity implies 
that the Fourier transform of the density, ρ(Q), 
where Q is the so-called ordering wave vector 
of the CDW, acquires a finite expectation value. 
For a CDW, the expectation value of ρ(Q) is the 
order parameter, which is zero in the disordered 
(or normal) phase and finite in the ordered phase. 
The transition between these phases can be driven 
by external forces (electric, mechanical, or 
thermal). 
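
As a minimal numerical sketch of this definition (not taken from the review), the snippet below evaluates the Fourier component of a one-dimensional density profile at the ordering wave vector. The lattice size, the period-3 modulation, its amplitude, and the function name `cdw_order_parameter` are all illustrative assumptions.

```python
import cmath
import math


def cdw_order_parameter(density, q):
    """Return |rho(Q)|, the magnitude of the density Fourier component at wave vector q.

    density: local densities rho(r) on sites r = 0..N-1 (lattice spacing 1, assumed)
    q: wave vector in radians per site
    """
    n = len(density)
    rho_q = sum(rho * cmath.exp(-1j * q * r) for r, rho in enumerate(density)) / n
    return abs(rho_q)


N = 120              # number of lattice sites (illustrative)
Q = 2 * math.pi / 3  # ordering wave vector, period of 3 sites (illustrative)

# Normal phase: uniform density, no modulation.
normal = [1.0] * N
# CDW phase: density modulated at wave vector Q with amplitude 0.2.
cdw = [1.0 + 0.2 * math.cos(Q * r) for r in range(N)]

print(cdw_order_parameter(normal, Q))  # vanishes up to rounding error
print(cdw_order_parameter(cdw, Q))     # half the modulation amplitude, about 0.1
```

With these assumed numbers the uniform profile has a vanishing Fourier component at Q, whereas the modulated profile has a finite one, which is the sense in which this quantity separates the normal phase from the ordered phase.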

Two-dimensional systems play a particular role 
in the physics of phase transitions. For a system 
with a continuous order parameter, it is not pos- 
sible to have true long-range order in less than 
three dimensions at any finite temperature T, 
implying that even minute thermal fluctuations 
can destroy order (23). In two dimensions, long- 
range order is possible only at strictly zero tem- 
perature. At T = 0, it is also possible for a system 
to be disordered if one varies an external param- 
eter such as pressure or electric field, E (Fig. 2). 
The point at which a system becomes ordered at 
T = 0 is called the quantum critical point, and the 
transitions are called quantum phase transitions. 
In this case, it is not thermal motion that drives 
the system from order to disorder but quantum 
fluctuations. In this type of transition, the scale 
at which order is created is characterized by a 
correlation length ξ, which diverges at the phase 
transition as 


$$\xi(E) \sim 1/|E - E_c|^{\nu}$$ 


where E_c is the critical field and ν is the critical 
exponent. Fluctuations of the order parameter at 
different points in space decay exponentially over 
the length ξ. Variations in length scales lead to fluctuations 
in energy scales as well. In a second-order phase 
transition, the characteristic energy scale, Δ, as- 
sociated with the particular order (that is, the 
energy gap in the system) vanishes at the phase 
transition with another dynamical exponent, 
z, as 


$$\Delta(E) \sim 1/\xi^{z} \sim |E - E_c|^{z\nu}$$ 
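
The two scaling relations can be tabulated with a short sketch, writing ν for the correlation-length exponent and z for the dynamical exponent. The numerical values of the critical field and of both exponents, the unit prefactors, and the function names are assumptions chosen only for illustration; the review does not quote them.

```python
def correlation_length(e_field, e_c, nu, xi0=1.0):
    """Correlation length xi(E) = xi0 / |E - E_c|**nu, which diverges as E approaches E_c."""
    return xi0 / abs(e_field - e_c) ** nu


def energy_gap(e_field, e_c, nu, z, delta0=1.0):
    """Energy scale Delta(E) = delta0 * |E - E_c|**(z * nu), proportional to 1 / xi**z."""
    return delta0 * abs(e_field - e_c) ** (z * nu)


E_C = 1.0  # critical field in arbitrary units (assumed)
NU = 0.67  # correlation-length exponent (assumed value)
Z = 1.0    # dynamical exponent (assumed value)

for e_field in (1.5, 1.1, 1.01, 1.001):
    xi = correlation_length(e_field, E_C, NU)
    gap = energy_gap(e_field, E_C, NU, Z)
    # The correlation length grows and the gap shrinks as the field approaches E_C.
    print(f"E = {e_field:<6} xi = {xi:8.2f}  gap = {gap:.4f}")
```

The sketch is not defined exactly at the critical field, where the correlation length diverges.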

The simplest theory for understanding the 
effect of critical fluctuations close to a phase 
transition assumes that the order parameter 
(SDW, CDW, etc.) couples locally with the 
relevant degree of freedom (spin, charge, etc.). 
The resistivity is then given by the standard de 
Gennes-Friedel formula, in which the electron 
mean free path scales with the differential 
scattering cross section of the order parameter 
fluctuations. 

In a classical phase transition, the behavior is 
driven by thermal fluctuations. The resistivity has 
the same kind of singularity as internal energy, 
implying that the critical behavior of the deriv- 
ative of the resistivity is the same as the specific 
heat at the phase transition. This indicates that 
in a classical phase transition the critical be- 
havior is marked by an inflection point in the 
resistivity at T_c. 

Fig. 1. Electronic properties of different classes of 2D materials. The Fermi level is set to the zero of 
the energy scale. The DOS is given in states per electron volt per cell. 

Even though, for a 2D system, long-range or- 
der is not possible at any finite temperature, the 
system can undergo a transition to quasi- 
long-range order (KT transition) with 
the presence of vortex-antivortex pairs 
(24). In this case, the order parameter 
correlation length obeys the exponential 
dependence with temperature T 


$$\xi(T) \sim a \exp\left(b/|T - T_{KT}|^{1/2}\right),$$ 


where a and b are constants and T_KT(E) 
is the KT transition temperature, which 
is a function of the external tuning param- 
eter E. The resistivity scales with some 
power of the inverse correlation length 
and hence is supposed to have an expo- 
nential dependence with temperature. 
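
A small sketch of this temperature dependence is given below, restricted to temperatures above the KT transition. The constants a and b, the transition temperature, and the power relating resistivity to the inverse correlation length are assumed illustrative values, and the function names are hypothetical.

```python
import math


def kt_correlation_length(temp, t_kt, a=1.0, b=1.0):
    """xi(T) = a * exp(b / sqrt(T - T_KT)), evaluated above the KT transition only."""
    if temp <= t_kt:
        raise ValueError("this sketch only covers T > T_KT")
    return a * math.exp(b / math.sqrt(temp - t_kt))


def kt_resistivity(temp, t_kt, power=2.0, rho0=1.0, a=1.0, b=1.0):
    """Resistivity taken proportional to xi**(-power); the power is an assumed parameter."""
    return rho0 * kt_correlation_length(temp, t_kt, a, b) ** (-power)


T_KT = 3.0  # KT transition temperature in kelvin (illustrative)

for temp in (5.0, 4.0, 3.5, 3.1):
    # The resistivity drops exponentially fast as T approaches T_KT from above.
    print(f"T = {temp:.1f} K  xi = {kt_correlation_length(temp, T_KT):9.3f}  "
          f"rho = {kt_resistivity(temp, T_KT):.3e}")
```

Because T_KT depends on the tuning parameter E, repeating the evaluation with a different assumed `T_KT` mimics moving along the field axis of the phase diagram.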


Semiconducting 
group-VIB dichalcogenides 


Because of the charge confinement and 
reduced dielectric screening, the op- 
tical properties of semiconducting 2D materials 
are dominated by excitonic effects. The optical 
spectrum of MoS2, one of the most studied TMDCs, 
is characterized by three main transitions, named 
the A, B, and C peaks. The A exciton is the lowest 
in energy, corresponding to the fundamental optical 
gap of the material. The corresponding exciton 
binding energy is ~1 eV, according to theory. The 
B exciton also corresponds to a transition at the 
K point but for opposite spin. The C peak is of a 
different nature, as it has contributions from ex- 
citons from a large, annular-shaped region of the 
k-space with nearly identical transition energies. 
In nearly neutral monolayer samples, other quasi- 
particles have been observed, including positively 
and negatively charged excitons (i.e., trions) and 
biexcitons (25-27). The large trion binding ener- 
gies (20 to 30 meV) have no parallel in traditional 
semiconductors and allow for these quasi-particles 
to be observed even at room temperature. 

The Rydberg exciton states above the 
1s (A) exciton of WS2 form a series 
that deviates considerably from the hydrogen 
model (28, 29). Not only do the 1s, 2s, 3s, ... ns 
states have a closer spacing for small n, reflecting 
a weaker screening at short range (~log r, 
where r is the electron-hole separation) (28), they 
also have an entirely different dependence on the 
angular momentum. Ab initio GW calculations 
show that the states in the same shell but with 
higher angular momentum are at lower energy 
levels—that is, 3d, 3p, and 3s are in order of increas- 
ing energy. 
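
For reference, the hydrogenic baseline from which the measured series deviates can be written down explicitly. The sketch below uses the ideal 2D hydrogen model, in which binding energies scale as 1/(n - 1/2)^2; the 1s binding energy is a placeholder value, not a number from the review, and the function name is hypothetical. It does not model the nonhydrogenic screening itself.

```python
def hydrogenic_2d_binding(n, binding_1s):
    """Binding energy of the ns state in the ideal 2D hydrogen model.

    The 2D hydrogen series scales as 1 / (n - 1/2)**2, so relative to the 1s state
    the ns binding energy is binding_1s / (2n - 1)**2.
    """
    if n < 1:
        raise ValueError("principal quantum number must be >= 1")
    return binding_1s / (2 * n - 1) ** 2


BINDING_1S = 0.3  # eV, placeholder 1s binding energy (assumed)

levels = {n: hydrogenic_2d_binding(n, BINDING_1S) for n in range(1, 6)}
for n in range(1, 5):
    spacing = levels[n] - levels[n + 1]
    # Spacing between adjacent s states in the ideal hydrogenic series.
    print(f"{n}s -> {n + 1}s spacing: {spacing * 1000:.1f} meV")
```

Comparing measured level spacings against this table is one simple way to see the closer spacing at small n described above.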

From the technological point of view, however, 
the most relevant transitions are those close to 
the fundamental gap at K(K’) points of the 
Brillouin zone, which can be used for manipulat- 
ing quantum information stored as spin and 
momentum (valley index) of individual electrons, 
holes, or excitons. The selection rule for optical 
transitions is valley-dependent, with the K(K’) 
valley coupling exclusively to right (left) circu- 
larly polarized light. Thus, the valley index, or 
pseudospin, can be controlled coherently by using 
polarized light. Because the two valleys have non- 
zero and symmetrical Berry curvature, in the pres- 
ence of in-plane electric field they give rise to Hall 
currents with sign depending on the valley index, 
an effect known as the valley Hall effect. The or- 
bital magnetic moment is also valley-dependent, 
which allows for coupling with magnetic fields 
(30-36). 

Fig. 2. Phase diagram for a 2D material with a quantum phase 
transition. 

Quantum dots of TMDCs inherit the valley 
properties of the monolayer and therefore are 
appealing for valleytronics due to the possibility 
of controlling spin and valley states of single con- 
fined electrons or holes—for example, via interac- 
tion with propagating single photons. Quantum 
dots can be created by growing finite islands on a 
monolayer substrate or by applying confinement 
potentials using patterned electrodes. 


Phosphorene and 
group-IV monochalcogenides 


Phosphorene, a monolayer of black phosphorus, 
is a monoelemental 2D material. Monolayer, few- 
layer, and bulk black phosphorus are all semicon- 
ducting materials, with a direct or nearly direct band 
gap (4). Additionally, phosphorene has a very 
high mobility that can reach 1000 cm²/V·s for 
devices of ~10-nm thickness at room temperature 
(37). This exceeds the carrier mobility of TMDCs. 
According to theoretical predictions, the phonon- 
limited hole mobilities can reach 10,000 to 26,000 
cm²/V·s for the monolayer (zigzag direction) (38). 

Both the optical and transport properties of 
phosphorene are highly anisotropic, as a conse- 
quence of this material’s orthorhombic, wavelike 
structure. Optical selection rules dictate that the 
absorption threshold is lower for linearly polarized 
light along the armchair direction than along the 
perpendicular direction. Optical conductivity and 
Raman spectra are also anisotropic, providing a 
fast way to determine phosphorene’s lattice ori- 
entation. In addition to its optical and electronic 
properties, fundamental research in phosphor- 
ene has unraveled a growing number of physical 
phenomena, including superconductivity, high 
thermoelectric figure of merit (39), birefringence, 
and colossal ultraviolet (UV) absorption. 

The group-IV monochalcogenides SnS, GeS, 
SnSe, and GeSe are isoelectronic with phosphor- 
ene and share its orthorhombic struc- 
ture, but the two atom types break the 
inversion symmetry of the monolayer. As 
a consequence, they feature spin-orbit 
splitting (19 to 86 meV) (40) and piezo- 
electricity with large coupling between de- 
formation and polarization change in plane 
(with piezoelectric coefficients e33 ranging 
from 7 × 10⁻¹⁰ to 23 × 10⁻¹⁰ C/m, largely 
exceeding those of MoS2 and hBN) (41). 

SnS, SnSe, and GeSe are semiconduc- 
tors, with gap energies covering part of 
the infrared and visible range for differ- 
ent numbers of layers (40). Even though 
the indirect band gap (in most cases) 
makes these materials less attractive for 
optical applications, the existence of two 
pairs of twofold degenerate valence and 
conduction band valleys, each placed on 
one principal axis of the Brillouin zone, 
makes them suitable for valleytronics appli- 
cations. In this case, the symmetry is orthorhombic 
and, thus, the valley manipulation processes are 
different from those for TMDCs. Valley pairs can be 
selected using linear rather than circularly polar- 
ized light. Furthermore, there is no valley Hall 
effect, so the transverse valley current under an 
electric field is a second-order effect. Group-IV 
monochalcogenides are more stable against oxi- 
dation than phosphorene, can be grown by chem- 
ical vapor deposition (CVD), and have been recently 
exfoliated down to their bilayers. 


Gallium and indium monochalcogenides 


GaX and InX (where X is a chalcogen, like S, Se, 
or Te) are additional members of the family of 
hexagonal 2D materials. In this case, each layer 
can be viewed as a double layer of metal M = Ga, 
intercalated between two layers of chalcogen 
(X-M-M-X). The band structure of monolayers 
of such materials is rather unusual, having a 
“Mexican hat” dispersion at the top of the va- 
lence band, leading to a high DOS (42, 43) (bulk 
materials are most probably direct band-gap semi- 
conductors). Thus, these materials have high and 
fast photoresponsivity (44, 45) and large second- 
harmonic generation and have attracted attention 
mostly due to their optical properties. If the Fermi 
level is close to this singularity in p-doped ma- 
terials, a ferromagnetic instability arises (46). 
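
To illustrate why a "Mexican hat" band edge produces a high DOS, the sketch below uses a toy isotropic dispersion E(k) = -alpha (k^2 - k0^2)^2, whose maximum lies on a ring of radius k0. This model, its parameters, and the function name are assumptions made for illustration; they are not the band structure computed in the cited work. For this model the 2D DOS per spin and per unit area at an energy eps below the band maximum is 1/(8 pi sqrt(alpha eps)) for each branch of k that solves the dispersion.

```python
import math


def mexican_hat_dos(eps, alpha, k0):
    """DOS per spin per unit area for the toy band E(k) = -alpha * (k**2 - k0**2)**2.

    eps is the energy measured downward from the band maximum (eps > 0 inside the band).
    Each solution branch contributes 1 / (8 * pi * sqrt(alpha * eps)).
    """
    if eps <= 0:
        return 0.0  # above the band maximum there are no states
    per_branch = 1.0 / (8.0 * math.pi * math.sqrt(alpha * eps))
    # Inner and outer rings both exist only while eps < alpha * k0**4.
    branches = 2 if eps < alpha * k0 ** 4 else 1
    return branches * per_branch


ALPHA = 1.0  # quartic coefficient in arbitrary units (assumed)
K0 = 1.0     # ring radius in arbitrary units (assumed)

for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    # The DOS grows as 1/sqrt(eps) on approaching the band edge.
    print(f"eps = {eps:.0e}  DOS = {mexican_hat_dos(eps, ALPHA, K0):.3f}")
```

In this toy model the DOS diverges as the inverse square root of the distance from the band edge, in contrast to the constant DOS of a parabolic 2D band.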


Hexagonal boron nitride 


Layers of hBN consist of hexagonal rings of al- 
ternating B and N atoms, with strong covalent 
sp2 bonds and a lattice constant nearly identical 
to that of graphite. hBN is very resistant both to 
mechanical manipulation and chemical interac- 
tions and also has a large band gap in the UV 
range. For these reasons, hBN is a material of 
choice as an encapsulating layer or substrate 
for 2D stacked devices, providing an atomically 
smooth surface free of dangling bonds and charge 
traps. hBN substrates leave the band structure of 
graphene near the Dirac point virtually unperturbed 
(if crystallographic orientations of the two crystals 
are misaligned) and dramatically improve the 
mobility of graphene devices (47, 48). 

Oxide layers and other insulators 

Many oxides have layered structures and can 
therefore be seen as a source for new 2D mate- 
rials. These include lead oxide and lead salts [PbO, 
Pb2O(SO4), NaPbOs, etc.], phosphorus oxides and 
phosphates, molybdenum and vanadium oxides, 
and other transition metal oxides. In these ma- 
terials, the layers are often connected by weak 
covalent bonds, oxygen bridges, or intercalating 
elements and are normally nonstoichiometric 
(due to the presence of oxygen vacancies). Further, 
layered oxides are normally polycrystalline, and 
mechanical exfoliation methods are usually limited 
to those available in higher-quality crystals. For the 
chemical means of production of such mono- 
layers, intercalation with bulky guest species 
(such as tetrabutylammonium ions) has been 
used. Some of these layered oxides have been 
studied due to their importance as battery cathode
materials (e.g., MoO3, V2O5, and other Mo
and V oxides), superconductors (e.g., copper and
cobalt layered oxides) (49, 50), passivating layers
(phosphorus oxide) (51, 52), and other areas of
technological interest. Layered oxides allow for 
alloying, combination of different layers, and in- 
tercalation of ions and molecules; the possibil- 
ities of materials design are immense. 

Among the most studied 2D insulators are hy- 
brid perovskites, which are noteworthy for their 
high optical absorption coefficient within the 
solar spectrum and strong luminescence. Thin- 
film perovskite-based solar cells have emerged 
with a 20% power conversion efficiency, a no- 
table value for a new technology (53). A hybrid 
perovskite is formed by layers of a metal halide 
intercalated with layers of organic chains. The 
high solar cell efficiency is thought to be largely
attributable to the confinement of excitons to the
layers. Few-layer hybrid perovskites have been
isolated by mechanical exfoliation and found to
be stable in air on a time scale of minutes.


Novel van der Waals heterostructures 


Two-dimensional crystals can be assembled into 
heterostructures (54), where the monolayers are 
held together by van der Waals forces. Consid- 
ering that a large number of 2D crystals is cur- 
rently available, it should be possible to create a 
substantial variety of heterostructures. However, 
the assembly technique currently in use (micromechanical
stacking) allows only certain combinations
of the interfaces. At the same time, an
alternative technique that could allow mass
production of such structures (i.e., sequential
growth of monolayers) comes with its own limitations
and is presently in its infancy. Nevertheless,
a large variety of novel experiments and proto- 
types have already been carried out with van der 
Waals heterostructures, which indicates that 
these materials are versatile and practical tools 
for future experiments and applications. 


Assembly techniques 


Currently, the most versatile technique for heterostructure
assembly is direct mechanical assembly.
This technique flourished starting in 2010
with Dean et al.’s work, which demonstrated the
very high performance of graphene devices placed
on an hBN substrate (47).

The technique used in the early works is based 
on preparing a flake of 2D crystal (Fig. 3, A to F) 
on a sacrificial membrane, aligning and placing
it on another flake, and then removing the membrane.
The process is then repeated to deposit
further layers. Although the crystals are exposed
to the sacrificial membrane and solvents, which can
contaminate the interface, annealing allows one
to remove the contaminants and achieve very
high interface quality (55), reaching high mobility
(~10⁶ cm²/V·s) in graphene devices prepared
this way.


Fig. 3. Wet-transfer and pick-and-lift techniques for assembly of van der Waals heterostructures. 
(A to F) Wet-transfer technique. A 2D crystal prepared on a double sacrificial layer (A) is lifted on one layer 
by dissolving another (B). The crystal is then aligned (C) and placed (D) on top of another 2D material. 
Upon the removal of the membrane (E), a set of contacts and mesa can be formed (F). This process could 
be repeated to add more layers on top. (G to O) Pick-and-lift technique. A 2D crystal on a membrane [see 
(B)] is aligned (G) and then placed atop another 2D crystal (H). Depending on the relative size of the two 
crystals, it is possible to lift both flakes on the same membrane (I). By repeating the process, it is possible
to then lift additional crystals [(J) to (L)]. Finally, the whole stack is placed on the crystal, which will serve as 
the substrate [(M) and (N)], and the membrane is dissolved, exposing the entire stack (O). 

A substantially cleaner method (dubbed the
“pick-and-lift” method) is based on strong van 
der Waals interactions that exist between the 
crystals. When the membrane with a 2D crystal 
on it is brought into contact with another 2D 
crystal, it is not dissolved but rather is lifted up 
(Fig. 3, G to O); then there is a chance that the 
second crystal will stick to the first and will be 
lifted together with it. The process can be re- 
peated several times. This technique results in 
clean interfaces over large areas and yet higher 
electron mobility (56). Further advances could be 
achieved by transferring the whole process into a 
glovebox with a controllable atmosphere. 


1D contacts 


The latter method (Fig. 3, G to O) has one particular
disadvantage: A completely assembled
stack prevents one from making contacts to the
inner layers. Luckily, it has been demonstrated
that one can achieve various profiles of the edges 
of such a stack by reactive plasma etching. Thus, 
it is possible to etch the edge of the stack in such 
a way that the desired layer becomes exposed 
and can be contacted by metal evaporation (56) 
(Fig. 4A). The contact resistance for graphene 
can be as low as 35 ohm·µm.


Self-cleansing mechanism 


The transmission electron microscopy (TEM) stud- 
ies (55) demonstrate that interfaces can be atom- 
ically flat and free of any contamination [Fig. 4, B 
to D; adapted from (55)]. The reason for such
behavior is the so-called “self-cleansing” mecha- 
nism (57). If the affinity between the two 2D 
crystals is larger than the affinity between the 
crystals and the contaminants, then the energet- 
ically favorable situation is when the two crystals 
have the largest possible common interface. To 
achieve this condition, the contaminants are pushed 
away. This explains the observation of bubbles 
under transferred 2D crystals: Those are the 
pockets of contamination pushed from the rest 
of the interface [Fig. 4, E to J; adapted from (57)]. 
This self-cleansing mechanism works only on 
certain pairs of crystals (Fig. 4, E to G). 


Surface reconstruction 


Potentially, the van der Waals interaction be- 
tween two 2D crystals might lead to surface re- 
construction. The most suitable candidates for 
the observation of such effects are crystals with 
similar lattice constants, such as graphene on 
hBN. The lattice constant of hBN is only 1.8% 
larger than that of graphene, which leads to the 
formation of a moiré pattern (5). 
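
To get a feel for the length scale that a 1.8% mismatch implies, the sketch below evaluates the standard geometric expression for the moiré period of two hexagonal lattices, λ = (1 + δ)a / sqrt[2(1 + δ)(1 − cos θ) + δ²], where δ is the lattice mismatch and θ the misorientation angle. The graphene lattice constant (0.246 nm) and the function name are assumptions introduced here for illustration; they are not taken from the text.

```python
import math

A_GRAPHENE_NM = 0.246  # assumed graphene lattice constant (not stated in the text)
DELTA_HBN = 0.018      # hBN lattice constant is 1.8% larger (from the text)


def moire_period_nm(theta_deg, a_nm=A_GRAPHENE_NM, delta=DELTA_HBN):
    """Moire period for two hexagonal lattices with mismatch delta, rotated by theta."""
    theta = math.radians(theta_deg)
    denom = math.sqrt(2.0 * (1.0 + delta) * (1.0 - math.cos(theta)) + delta ** 2)
    return (1.0 + delta) * a_nm / denom


if __name__ == "__main__":
    for angle in (0.0, 1.0, 3.0):
        print(f"theta = {angle:.0f} deg -> period = {moire_period_nm(angle):.1f} nm")
```

Under these assumptions the period is largest (roughly 14 nm) for perfectly aligned crystals and shrinks quickly as the misorientation angle grows.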

It has been demonstrated that the most favor- 
able configuration for graphene on hBN is when 
boron atoms lie on top of one of the sublattices
in graphene and nitrogen is situated at the cen- 
ter of the hexagon (58). Then, by stretching itself 
to match the interatomic spacing of hBN, graphene 
tries to increase the area where the favorable 
configuration is achieved. Owing to the high
Young’s modulus of graphene, such perfect stacking
cannot be achieved across the whole interface
(unless the hBN can contract), as the loss in
elastic energy would not be compensated by the
gain in the van der Waals interaction. Thus,
such stretching of graphene can only be local, 
and the stretched regions would be separated 
by areas where the graphene lattice is not com- 
mensurate with hBN (Fig. 4, K and L). 

This effect has been observed for graphene on 
hBN when the crystallographic orientations of 
the two crystals are practically aligned (10). In 
this case, the large regions of the moiré pattern 
where the two crystals are commensurate are 
separated by areas where the graphene lattice is 
relaxed. This effect disappears when the graphene
is misoriented with respect to hBN. Such a
commensurate-incommensurate transition happens
at a critical angle, which is set by the
lattice mismatch (10).

Stacks of several other 2D crystals, including
MoS2 and MoSe2 (59), MoS2 and WS2 (60),
fluorographene and MoS2 (61), and many others, have
been investigated for electronic properties (62)
and possible surface reconstruction. For example,
layer-breathing phonon modes have been observed by
means of Raman spectroscopy for MoSe2/MoS2
heterobilayers (63). However, because the
lattice-constant mismatch for those pairs is usually
above 2%, the surface reconstruction would be
hard to observe. It has been experimentally
detected for silicene on MoS2, where vertical
buckling of silicene allows perfect stacking between
the two crystals (64).


Spectrum reconstruction for graphene 
on hBN 


Fig. 4. Morphology of the van der Waals heterostructures. (A) 1D contacts to van der Waals
heterostructures. Etching mesa in van der Waals heterostructures exposes the edges of the crystals inside the
stack, which allows formation of 1D contacts. Here, carbon atoms are represented by blue spheres, boron
is shown in yellow, and nitrogen is in purple. (B to D) TEM cross section of a graphene/hBN
heterostructure. (B and C) Scanning TEM image (C) of the structure schematically presented in (B). In (B), atom
coloring is the same as in (A). (D) High-angle annular dark-field image of the same stack. Scale bars in (C)
and (D), 2 nm. (E to J) Atomic force microscopy (AFM) images of graphene transferred on other 2D
crystals. A self-cleansing mechanism pushes contamination (hydrocarbons) away from graphene on hBN
(E), MoS2 (F), and WS2 (G) interfaces, forcing the contamination to gather in bubbles. Instead, on the
substrates with poor adhesion to graphene, such as mica (H), BSCCO (I), and V2O5 (J), contamination is
spread uniformly across the whole interface. Images in (E) to (J) are 15 µm by 15 µm, with a z scale of 4 nm.
(K and L) Local Young’s modulus for graphene on hBN for misorientation angles of 3° (K) and 0° (L). Note the
sharp domain walls in (L). Scale bars, 14 nm. (M) Reconstructed electronic energy spectrum for graphene
aligned on hBN.


Moiré patterns in graphene on hBN provide a
periodic scattering potential for electrons. This leads
to the reconstruction of the electronic spectrum in
graphene at the wave vectors determined by the
periodicity of the moiré structure, as has been
observed in scanning tunneling microscopy (5)
and, later, in transport (6-8) and capacitance (9)
measurements. Secondary Dirac points appear
in the electronic spectrum, in both the valence
and conduction bands [Fig. 4M; adapted from
(65)]. The energy range where the spectrum is
reconstructed is given by the strength of the van
der Waals interaction between graphene and
hBN and is estimated to be on the order of 50 meV.
Furthermore, the surface reconstruction leads
to a strong asymmetry between the sublattices
in graphene, which opens a gap in the graphene
spectrum.
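
As a rough guide to where the secondary Dirac points sit in energy, one can assume that graphene keeps its linear dispersion and that the new points appear at the edge of the moiré mini-Brillouin zone, which gives E ≈ ħv_F|G|/2 with |G| = 4π/(√3 λ) for a moiré period λ. The sketch below evaluates this estimate; the Fermi velocity (10⁶ m/s), the example period, and the function name are assumptions made here, and the result is the position of the secondary points relative to the main Dirac point, not the ~50 meV interaction scale quoted above.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s
V_FERMI_M_S = 1.0e6          # assumed graphene Fermi velocity


def secondary_dirac_energy_ev(moire_period_nm, v_fermi=V_FERMI_M_S):
    """Energy offset of the secondary Dirac points, assuming linear dispersion."""
    # Magnitude of the moire reciprocal lattice vector, in 1/m.
    g = 4.0 * math.pi / (math.sqrt(3.0) * moire_period_nm * 1e-9)
    return HBAR_EV_S * v_fermi * g / 2.0


if __name__ == "__main__":
    for period in (14.0, 10.0, 5.0):
        energy = secondary_dirac_energy_ev(period)
        print(f"period = {period:4.1f} nm -> E = {1e3 * energy:.0f} meV")
```

In this simple picture, shorter moiré periods (larger misalignment) push the secondary Dirac points further away from the main one, so they become harder to reach by gating.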


Capacitively coupled 
van der Waals heterostructures 


Conceptually, the simplest devices based on van 
der Waals heterostructures are those for capac- 
itance measurements. hBN is an ideal insulator 
that can sustain large electric fields (0.5 V per 
layer and above), allowing the preparation of ca- 
pacitors with a very thin dielectric. The use of a 
thin dielectric ensures a large contribution of the 
quantum capacitance, which is directly propor- 
tional to the DOS in the electrode, making ca- 
pacitance measurements a viable tool to study 
both single-particle and interaction phenomena 
in 2D materials. A number of systems have been 
investigated so far, including quantum capacitance 
in graphene (66), various sandwiches of graphene 
with TMDCs (57), and black phosphorus (67). 
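
The role of the dielectric thickness can be illustrated with a simple series-capacitor model, 1/C = 1/C_g + 1/C_q, where C_g = ε0·ε_r/d is the geometric capacitance and C_q = e²·DOS is the quantum capacitance. The sketch below uses the ideal-graphene DOS, 2|E_F|/(πħ²v_F²); the hBN dielectric constant (3.5), the layer spacing (0.33 nm), the Fermi velocity, and all function names are assumptions chosen for illustration rather than values from the text.

```python
import math

E_CHARGE = 1.602176634e-19  # C
HBAR = 1.054571817e-34      # J*s
EPS0 = 8.8541878128e-12     # F/m
V_FERMI = 1.0e6             # m/s, assumed graphene Fermi velocity
EPS_HBN = 3.5               # assumed out-of-plane dielectric constant of hBN
HBN_LAYER_NM = 0.33         # assumed hBN interlayer spacing


def graphene_dos(e_fermi_ev):
    """Ideal-graphene density of states per unit area and energy, in 1/(J m^2)."""
    e_joule = abs(e_fermi_ev) * E_CHARGE
    return 2.0 * e_joule / (math.pi * (HBAR * V_FERMI) ** 2)


def quantum_capacitance(e_fermi_ev):
    """Quantum capacitance per unit area (F/m^2): e^2 times the DOS."""
    return E_CHARGE ** 2 * graphene_dos(e_fermi_ev)


def geometric_capacitance(n_layers):
    """Parallel-plate capacitance per unit area (F/m^2) of an hBN spacer."""
    return EPS0 * EPS_HBN / (n_layers * HBN_LAYER_NM * 1e-9)


def total_capacitance(e_fermi_ev, n_layers):
    """Series combination of the geometric and quantum capacitances."""
    cq = quantum_capacitance(e_fermi_ev)
    cg = geometric_capacitance(n_layers)
    return cq * cg / (cq + cg)


if __name__ == "__main__":
    for layers in (10, 100):
        ratio = total_capacitance(0.1, layers) / geometric_capacitance(layers)
        print(f"{layers:3d} hBN layers: C_total / C_geometric = {ratio:.2f}")
```

The thinner the spacer, the further the measured capacitance falls below the geometric value, which is why thin hBN makes the DOS-dependent term easier to resolve.
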

Capacitive coupling between two graphene
layers through a thin layer of hBN can also lead
to a number of noteworthy phenomena. This
method allows for very-high-quality Coulomb drag
devices, where two graphene layers, separated
galvanically, interact through Coulomb forces
between the charge carriers in the two layers
(68). Because it is an atomically flat crystal with a
very large gap in the electronic spectrum, hBN
barriers can be made very thin (on the order of a few
nanometers) before any tunneling sets in, bringing
the two graphene layers closer than the characteristic
distance between electrons in each of
the layers (10 nm for a characteristic density of
10¹² cm⁻²). This opens a new regime of effective
zero-layer separation in Coulomb drag experiments.
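
The characteristic distance between carriers quoted above follows from the simple estimate d ≈ n^(-1/2) for a sheet density n. The helper below (a hypothetical name, introduced only for this check) converts a density in cm⁻² to a spacing in nanometers.

```python
def interparticle_distance_nm(density_cm2):
    """Mean carrier spacing d ~ n**-0.5 for a 2D sheet density n given in cm^-2."""
    nm_per_cm = 1e7
    return nm_per_cm / density_cm2 ** 0.5


if __name__ == "__main__":
    for density in (1e11, 1e12, 1e13):
        spacing = interparticle_distance_nm(density)
        print(f"n = {density:.0e} cm^-2 -> d = {spacing:.1f} nm")
```

A density of 10¹² cm⁻² gives 10 nm, so an hBN barrier of a few nanometers is indeed thinner than the in-plane carrier spacing.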


Tunneling devices 


Graphene can be combined with semiconductor 
and insulating 2D crystals to create a tunnel 
junction (69). The use of hBN as a tunneling 
barrier is particularly attractive due to its large 
band gap (~6 eV), low number of impurity states 
within the barrier, and high breakdown field. 
Because the position of the Fermi energy and the
DOS in graphene can be varied by an external gate,
the tunneling current can be varied as well, which
allows such structures to be used as field-effect
tunneling transistors (FETTs) (70).

The architecture of FETTs enables tunneling
spectroscopy to probe the DOS in graphene, as well
as to observe impurity- and phonon-assisted
tunneling (71). Elastic tunneling through impurities
gives peaks in dI/dV_b (I, current; V_b, bias voltage);
peak positions depend on both bias and gate
voltages (Fig. 5C). On the other hand, inelastic
phonon-assisted tunneling is characterized by
a set of plateaus in dI/dV_b, independent of gate
voltage (71) [more pronounced in d²I/dV_b² (Fig.
5B)]. When the bias voltage is large enough
to emit a phonon (eV_b = ħω_ph, where e is the
electron charge, ħ is Planck’s constant h divided by
2π, and ω_ph is the frequency of the emitted
phonon), an additional channel opens for
electron tunneling, which increases transmission
probability and, hence, tunnel conductance.
Tunneling through impurities and with phonon
emission is especially visible if the crystallographic
lattices of the two graphene electrodes
are strongly misoriented with respect to each other,
which prohibits direct electron tunneling because
it is impossible to fulfill the momentum
conservation requirements.
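
The threshold condition eV_b = ħω_ph means that each phonon mode switches on at a bias equal to its energy divided by the electron charge. The sketch below converts a phonon wavenumber (the unit commonly used in Raman spectroscopy) into that threshold bias. The example wavenumber of 1580 cm⁻¹, close to the graphene G mode, and the function name are illustrative assumptions and are not taken from the text.

```python
PLANCK_EV_S = 4.135667696e-15     # Planck constant h in eV*s
LIGHT_SPEED_CM_S = 2.99792458e10  # speed of light in cm/s


def threshold_bias_mv(wavenumber_cm1):
    """Bias (mV) at which e*V_b equals the phonon energy h*c*wavenumber."""
    phonon_energy_ev = PLANCK_EV_S * LIGHT_SPEED_CM_S * wavenumber_cm1
    # For a single electron charge, an energy in eV maps directly to a bias in V.
    return 1e3 * phonon_energy_ev


if __name__ == "__main__":
    example_wavenumber = 1580.0  # cm^-1, illustrative value near the graphene G mode
    bias = threshold_bias_mv(example_wavenumber)
    print(f"{example_wavenumber:.0f} cm^-1 -> threshold near {bias:.0f} mV")
```

Because the threshold depends only on the phonon energy, the resulting steps in dI/dV_b stay at fixed bias when the gate voltage is swept, which is how they are distinguished from impurity peaks.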

If the crystallographic lattices of the two graphene 
electrodes are aligned, momentum conservation 
for tunneling electrons can be achieved without 
impurity or phonon scattering. Rotational mis- 
alignment of the two graphene crystals corresponds 
to a relative rotation of the two graphene Brillouin 
zones in the reciprocal space. If the misalignment 
is small enough (<2°), then the momentum dif- 
ference between the electronic states in the top 
and bottom graphene layers can be compensated 
electrostatically by applying bias and gate voltages
(72), leading to resonant tunneling and the
observation of negative differential resistance
(72) (Fig. 5D). A sharp negative differential resistance
feature allows one to build a tunable radio-
frequency oscillator with the potential to reach 
subterahertz frequencies. 
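
The size of the momentum mismatch can be estimated from geometry: rotating one layer by θ displaces its Dirac points by ΔK = 2|K| sin(θ/2), with |K| = 4π/(3a). Multiplying by ħv_F gives an order-of-magnitude energy scale that the bias and gate voltages have to supply. In the sketch below, the lattice constant (0.246 nm), the value ħv_F ≈ 0.658 eV·nm (for v_F = 10⁶ m/s), and the function names are assumptions made for illustration.

```python
import math

A_GRAPHENE_NM = 0.246  # assumed graphene lattice constant
HBAR_VF_EV_NM = 0.658  # hbar * v_F for an assumed v_F of 1e6 m/s


def dirac_point_offset_per_nm(theta_deg, a_nm=A_GRAPHENE_NM):
    """Displacement (1/nm) between the Dirac points of two layers twisted by theta."""
    k_magnitude = 4.0 * math.pi / (3.0 * a_nm)
    return 2.0 * k_magnitude * math.sin(math.radians(theta_deg) / 2.0)


def mismatch_energy_ev(theta_deg):
    """Energy scale hbar * v_F * delta_K associated with the momentum mismatch."""
    return HBAR_VF_EV_NM * dirac_point_offset_per_nm(theta_deg)


if __name__ == "__main__":
    for angle in (0.5, 1.0, 2.0):
        print(f"theta = {angle:.1f} deg -> energy scale = {mismatch_energy_ev(angle):.2f} eV")
```

For angles of a degree or two the energy scale is a few tenths of an electron volt, which is within reach of typical bias and gate voltages, whereas large misorientation angles are not.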

The highest on-off ratio for FETTs can be
achieved if the changes in the Fermi energy in
graphene are comparable with the gap in the
tunneling barrier, a situation achieved if
hBN is replaced with WS2 (on-off ratio of 10⁶)
(73) or MoS2 (on-off ratio of 10³ to 10⁴, probably
because of the presence of impurity bands)
(70). In addition to logic applications, tunneling
in van der Waals heterostructures has been exploited
for memory devices (74) with a floating gate, 
logic circuits (75), radio-frequency oscillators (72), 
and resonant tunneling diodes (76). 


Interaction with light 


Optoelectronic devices based on graphene (77) 
as well as other 2D materials (78) have been 
studied intensively. However, graphene photodetectors
typically have low responsivity, which
is a consequence of the low absorption coefficient.
Such issues are avoided when other 2D materials
are used for such purposes. Thus, TMDCs
(78), GaS (79), InSe (80), black phosphorus (81),
and other materials (82) have been used as photodetectors
(83) in photodiode or photoconductor
regimes. The advantages of using such materials
are the large DOS (which guarantees large optical
absorption), the materials’ flexibility, and the
possibility of local gating, which allows the creation
of p-n junctions (84). Furthermore, the band
gap in such materials often depends on the 
number of layers (3), which allows one to control 
the spectral response in such devices. 


Van der Waals heterostructures for 
photovoltaic applications 

Still, even larger opportunities open up when 
such materials are combined. Combinations of 
graphene (as a channel material) and TMDCs 
(as light-sensitive material, where trapped charges 
are controlled by illumination) allow creation of 
simple and efficient phototransistors (85). 

Combining materials with different work func- 
tions can lead to photoexcited electrons and holes 
accumulated in different layers, giving rise to 
indirect excitons [e.g., as has been observed for 
the pairs MoS2/WSe2 (86) and MoSe2/WSe2 (87)
(Fig. 5, E and F)]. Such excitons typically have 
long lifetimes, and their binding energy could 
be tuned by controlling the distance between the 
semiconductor layers. 

If p- and n-doped materials are used in such 
devices, then atomically sharp p-n junctions 
can be created (88, 89). Such devices are extreme- 
ly efficient in carrier separation, so they demon- 
strate very high quantum efficiency [for instance, 
GaTe/MoS2 devices had external quantum efficiencies
of >60% (88)]. Furthermore, their performance
can be tuned externally by gate voltage,
as has been demonstrated for black phosphorus/
MoS2 heterostructures (90).

Even more efficient photovoltaic devices can 
be created by combining thin layers of TMDCs 
(91) or metal chalcogenides (92) with graphene. 
By sandwiching the photosensitive material be- 
tween graphene electrodes, one can achieve very 
efficient photocarrier extraction from the device 
into graphene electrodes (which typically form 
good ohmic contacts with the TMDCs and serve 
as a transparent electrode as well). Because these
structures are typically symmetric (Fig. 5G), one
needs to create an electric field inside the TMDC
to separate carriers efficiently. This can be done
by applying a bias, by external gating (because the
electric field is not fully screened by graphene
due to its low DOS), or by doping the two graphene
layers differently.


Light-emitting diodes 

The p-n junctions described above can be op- 
erated in the regime of electrical injection of the 
charge carriers, which leads to electron-hole re- 
combination and light emission (89). However, 
such an arrangement is limited by the requirements
of synthesizing p- and n-type materials, which 
have not yet been demonstrated for all 2D crys- 
tals. Furthermore, the resistance of the junction 
is comparable to the resistances of the p and n 
electrodes, which makes it hard to control the 
current distribution. 

A more straightforward arrangement is carrier
injection from highly conductive transparent
electrodes directly into the 2D material in a ver- 
tical structure. Such a scheme, however, requires 
careful control of the dwell time of the injected 
electrons and holes in the semiconductor crystal, 
because photoemission is a slow process in com- 
parison with the characteristic time required to 
penetrate the junction between graphene and 
the semiconductor. The dwell time can be con- 
trolled by introducing additional tunnel barriers 
(Fig. 5H). Thus, two to three layers of hBN have 
been used (93) to increase the time electrons and
holes spend inside the monolayer TMDC, allowing
their radiative recombination. Devices based
on WSe2 are particularly efficient: Their quantum
efficiency increases with increasing temperature
and injection current, reaching 20% at room
temperature (94). One can increase the quantum
efficiency of such structures by placing several
layers of TMDCs in series (93) (Fig. 5I).


Plasmonic devices 


Fig. 5. Electronic and optoelectronic applications of van der Waals heterostructures. (A to D) Tunneling
in graphene/hBN/graphene tunnel transistors. (A) Schematic representation of a graphene tunneling device.
Graphene electrodes are shown in dark purple, and the hBN tunneling barrier is light blue. The electrodes
can be aligned with respect to each other. (B) d²I/dV_b² map of phonon-assisted tunneling. Color scale:
yellow to red corresponds to 0 to 3.8 x 10° ohm⁻¹ V⁻¹. (C) dI/dV_b map of resonant tunneling due to the
presence of impurities in the hBN tunnel layer. Color scale: yellow to red corresponds to 0 to 7 × 10⁻⁸ ohm⁻¹.
(D) dI/dV_b map of resonant tunneling with momentum conservation due to crystallographic alignment of two
graphene [...] to white to red corresponds to -6 × 10⁻⁶ to 0 to 6 × 10⁻⁶ ohm⁻¹. (E and F) Indirect excitons
in a MoS2/


Plasmons in graphene attract a lot of attention
because their frequency can be tuned by
changing the carrier concentration (95).
Plasmonic and phonon-polaritonic properties have
also been studied in other 2D materials. For instance,
hBN has polar dielectric properties, so it supports
surface phonon polaritons with very low optical
losses (96).
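
The tunability with carrier concentration can be made concrete with the textbook long-wavelength dispersion of plasmons in ideal doped graphene, ω² = e²·E_F·q / (2πħ²·ε0·ε_env), with E_F = ħv_F·sqrt(πn). The sketch below evaluates it; the Fermi velocity, the environment dielectric constant, the example plasmon wavelength, and the function names are assumptions chosen for illustration, and the model ignores losses and coupling to substrate phonons.

```python
import math

E_CHARGE = 1.602176634e-19  # C
HBAR = 1.054571817e-34      # J*s
EPS0 = 8.8541878128e-12     # F/m
V_FERMI = 1.0e6             # m/s, assumed graphene Fermi velocity


def fermi_energy_j(density_cm2):
    """Fermi energy (J) of ideal graphene at a sheet density given in cm^-2."""
    density_m2 = density_cm2 * 1e4
    return HBAR * V_FERMI * math.sqrt(math.pi * density_m2)


def plasmon_frequency_thz(density_cm2, wavelength_nm, eps_env=1.0):
    """Plasmon frequency (THz) at a given plasmon wavelength, long-wavelength limit."""
    q = 2.0 * math.pi / (wavelength_nm * 1e-9)
    omega_sq = (E_CHARGE ** 2 * fermi_energy_j(density_cm2) * q
                / (2.0 * math.pi * HBAR ** 2 * EPS0 * eps_env))
    return math.sqrt(omega_sq) / (2.0 * math.pi) / 1e12


if __name__ == "__main__":
    for density in (1e12, 4e12, 1.6e13):
        freq = plasmon_frequency_thz(density, wavelength_nm=200.0)
        print(f"n = {density:.1e} cm^-2 -> f = {freq:.1f} THz")
```

In this model the frequency scales as n^(1/4), so a 16-fold increase in carrier density doubles the plasmon frequency at fixed wave vector.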

A number of new polaritonic effects can be seen 
in van der Waals heterostructures. Encapsulation 
of graphene with hBN allows one to eliminate the 
scattering of graphene plasmons with impurities, 
increasing their inverse damping ratio by a factor 
of 5 in comparison with bare graphene (97). By 
sandwiching several graphene layers separated by 
hBN spacers, one can hybridize plasmonic modes 
in such multilayers, which can be further con- 
trolled with external gate voltage (98). 

In such heterostructures, it is possible to enter
a regime where the plasmon polaritons in graphene
and the phonon polaritons in hBN coexist
[Fig. 5, J to L; adapted from (99)]. Strong coupling
between the two leads to the formation of
new collective modes: plasmon-phonon polaritons
(100). Both the amplitude and the wavelength
of the new collective modes can be controlled
by gating graphene.

In aligned graphene/hBN heterostructures, the 
formation of the moiré pattern provides further 
modification of the graphene plasmonic spec- 
trum. Zone folding results in the formation of the 
secondary Dirac points (5-8, 65) (Fig. 4M), which 
allows a new type of vertical optical transition. 
Such transitions are immediately reflected in the 
modified damping factor, which exhibits a maximum
at such Fermi energies (101). It has also
been predicted that new plasmonic modes with
a carrier density dependence characteristic of parabolic
electronic bands should appear in the vicinity
of the van Hove singularities of the reconstructed
spectrum (101) (Fig. 4M).


Assembling van der Waals 
heterostructures in liquid and from 
liquid-phase-exfoliated 2D materials


A very powerful method of preparing graphene, 
which can also be extended to other materials, is 
liquid-phase exfoliation (102). Ink formulation 
based on such suspensions led to the develop- 
ment of graphene-based printed electronics (103). 
However, many applications would strongly benefit
from properties beyond the capabilities of
graphene inks. For example, high thermal conductivity
combined with dielectric properties can be
provided by hBN, and optoelectronic capabilities
can be provided by inks of 2D semiconductors.

The ability to print combinations of such ma- 
terials opens the door for low-cost fabrication of 
various devices (104). Planar (105) and vertical 
(106) photovoltaic devices based on TMDCs, as 
well as planar (107) and tunneling transistors (106)
based on graphene and hBN, have recently been 
demonstrated. 

By solution synthesis of 2D crystals or by con- 
trolling the charge on individual flakes in sus- 
pensions, heterostructures can be formed directly 
in the liquid phase (108) and can be used for 
energy applications. For instance, MoSe2/graphene
structures have been used for Li-ion battery applications
(109). Similar heterostructures have
also been used for catalytic applications (110).


Growing van der Waals heterostructures 


Direct growth methods such as CVD are prom- 
ising techniques for scalable manufacturing 
of van der Waals heterostructures (111). Such
techniques can be grouped as follows: (i) se- 
quential CVD growth of 2D crystals on top of 
mechanically transferred or grown 2D mate- 
rials, (ii) direct growth of TMDC heterostructures 
by vapor-solid reactions, and (iii) van der Waals 
epitaxy. State-of-the-art CVD, direct growth, and
van der Waals epitaxy methods have already
enabled the growth of many vertical heterostructures,
such as graphene/hBN (112-116), MoS2/graphene
(117-120), GaSe/graphene (121), MoS2/hBN
(122, 123), WS2/hBN (124), MoTe2/MoS2 (125),
WS2/MoS2 (126), VSe2/GeSe2 (127), MoSe2/Bi2Se3
(128), MoSe2/HfSe2 (129), MoS2/WSe2/graphene,
and WSe2/MoSe2/graphene (76).

In situ CVD growth of encapsulated graphene
in an hBN/graphene/hBN heterostructure was an
important achievement because it demonstrated
the scalability of high-mobility graphene-based
field-effect transistors (116). Also, some of the
TMDC heterostructures can be grown directly in
a single-step process: A WS2/MoS2 heterobilayer
was grown on a SiO2/Si substrate at 850°C from
precursors (W, S, MoO3) placed in the growth
tube [Fig. 6A; adapted from (126)]. Because of the
difference in the growth rates of MoS2 and WS2,
the formation of a MoxW1-xS2 alloy is suppressed.
A clean interface enabled a band alignment of
the two constituent layers, which led to the
observation of indirect excitons in the WS2/MoS2
heterostructure.


Fig. 6. Van der Waals epitaxy of vertical and in-plane heterostructures. (A) Schematic of the growth
process of vertically stacked and in-plane WS2/MoS2 heterostructures. (B) False-color dark-field TEM
image of a suspended hBN/graphene in-plane heterostructure. (C and D) High-resolution scanning
transmission electron microscopy image and an atomic model of a WSe2/MoS2 in-plane heterostructure.
(E) Schematics of vertical van der Waals heteroepitaxy of graphene on hBN. (F and G) (F) Moiré pattern of
a graphene/hBN heterostructure as observed in tapping-mode AFM and (G) high-pass-filtered inverse
fast Fourier transform of the dashed square region in (F). Scale bar in (F), 100 nm. (H) Schematics of the
growth of MoS2/WSe2/graphene (top) and WSe2/MoSe2/graphene (bottom). Synthesis of both
three-component heterostructures begins by growing three layers (3L) of epitaxial graphene (EG), followed by
metal-organic CVD growth of either MoS2 (I) or WSe2 (J). Then, another TMDC layer is grown by vapor
transfer of either MoS2 (K) or WSe2 (L). (I to K) AFM images of MoS2/graphene, WSe2/graphene, and
MoS2/WSe2/graphene vertical heterostructures, respectively. (L) Conductive AFM image of a WSe2/
MoSe2/graphene heterostructure. Color scale: black to white corresponds to 0 to 100 pA. Due to Se-S ion
exchange, a layer of MoSe2 forms from the original MoS2 layer.


Van der Waals epitaxy 


Van der Waals epitaxy was introduced more than
30 years ago with the growth of a NbSe2 monolayer
on a cleaved face of a MoS2 bulk crystal (130).
This work also led to the successful growth of
monolayer MoSe2 on SnS2 (131), as well as growth
of a two-component heterostructure of monolayer
NbSe2/trilayer MoSe2 on mica (132).

To grow graphene on hBN, Yang et al. used 
plasma to break down precursor methane mol- 
ecules (48), after which growth occurred at 500°C 
over the course of 2 to 3 hours on hBN crystals 
mechanically exfoliated on a SiO2/Si substrate.
Van der Waals interactions during epitaxial growth 
defined the preferential growth directions so 
that graphene crystals were aligned to the hBN 
substrate [Fig. 6, E to G; adapted from (48)]. 

Mechanically exfoliated hBN has also served
as a substrate for CVD-based van der Waals epitaxy
of a rotationally commensurate MoS2/hBN
heterostructure (122). Another example of van
der Waals epitaxy is the growth of high-quality
wafer-scale MoSe2/Bi2Se3 heterostructures on the
low-cost dielectric substrate AlN/Si in ultrahigh
vacuum conditions (128).

Graphene is also a good substrate for van der
Waals epitaxy: Grown WSe2/graphene heterostructures
show an atomically sharp interface and nearly
perfect crystallographic orientation between
graphene and WSe2, despite a large (23%) lattice
mismatch (133). Few-layer MoS2 and hBN structures
were also grown using epitaxial graphene
as a growth substrate (117, 134). Recently, monolayers
of WSe2 and MoS2 were grown on freestanding
CVD graphene (135). TMDC crystals
were also explored as substrates for epitaxy when
a MoTe2 monolayer was grown on a bulk MoS2
substrate (125).

Finally, van der Waals epitaxy can be repeated
several times to grow complex multicomponent
heterostructures, such as atomically thin resonant
tunneling diodes based on MoS2/WSe2/graphene
and WSe2/MoSe2/graphene [Fig. 6, H to L; adapted
from (76)]. To this end, an epitaxial graphene trilayer
was used as a substrate to grow monolayers
of either MoS2 (at 750°C) or WSe2 (at 850°C) via
powder vaporization or metal-organic CVD processes.
Subsequently, a second TMDC layer (WSe2
or MoS2) was grown on top of the initially grown
heterostructure.


Lateral heterostructures 


Lateral heterostructures can also be grown by
a variety of methods. For example, CVD-grown graphene
was lithographically patterned and etched away,
and hBN was grown via CVD, forming lateral 1D
heterojunctions [Fig. 6B; adapted from (136)].
Beyond graphene and hBN, lateral heterostructures
based on 2D TMDCs can be disruptive for
integrated optoelectronic devices. Although direct
growth favors TMDC alloys because of a similar
chemistry and a small lattice mismatch between
different TMDCs (137), two-step epitaxial growth
of a MoS2/WSe2 lateral heterostructure was
recently demonstrated [Fig. 6, C and D; adapted
from (138)]. To avoid alloying during growth, two
separate temperature regimes were used (138).
The atomically sharp WSe2/MoS2 heterojunction
has a depletion width of ~300 nm due to the
potential difference between the MoS2 and WSe2
regions.

Lateral heterostructures of MoS2/WS2 and
WSe2/MoSe2 were grown directly by controlling
the growth temperature at ~650°C (126).
Growth at relatively low temperatures was facilitated
by either introducing tellurium into the
CVD process (126) or using perylene-based growth
promoters (139). The use of growth-promoting
perylene-based aromatic molecules was recently
extended to stitch together largely dissimilar 2D
materials.


Conclusion 


The family of 2D crystals is continuously growing, 
both in terms of variety and number of materials, 
and it looks like this process is only beginning. 
Almost every new 2D material possesses unusual 
physical properties. The 2D physics (e.g., Kosterlitz-Thouless
transitions) in such materials is just starting to
emerge. Still, we argue that the most interesting 
phenomena can be realized in van der Waals 
heterostructures, which now can be mechanical- 
ly assembled or grown by a variety of techniques. 
Among the unsolved problems is the control of 
surface reconstruction, charge transfers, and built- 
in electric fields in such heterostructures. The 
standard band diagrams with quasi-electric fields 
are not a useful concept in 2D heterostructures; 
therefore, a new framework must be developed. 
combinatorial and cumulative genome editing 
INTRODUCTION: The developmental path by 
which a fertilized egg gives rise to the cells of a 
multicellular organism is termed the cell lineage. 
In 1983, John Sulston and colleagues docu- 
mented the invariant cell lineage of the round- 
worm Caenorhabditis elegans as determined 
by visual observation. However, tracing cell 
lineage in nearly all other multicellular orga- 
nisms is vastly more challenging. Contemporary 
methods rely on genetic markers or somatic 
mutations, but these approaches have limita- 
tions that preclude their application at the level 
of a whole, complex organism. 


RATIONALE: For a technology to comprehensively 
trace cell lineages in a complex multicellular sys- 
tem, it must uniquely and incrementally mark 
cells and their descendants over many divi- 
sions and in a way that does not interfere 
with normal development. These unique marks 
must also accumulate irreversibly over time, 
allowing the reconstruction of lineage trees. 
Finally, the full set of marks must be read out 
from each of many single cells. We hypothe- 
sized that genome editing, which introduces 
diverse, irreversible edits in a highly program- 
mable fashion, could be repurposed for cell 


lineage tracing in a way that realizes these 
characteristics. 

To this end, we developed a method termed 
genome editing of synthetic target arrays for 
lineage tracing (GESTALT). This method uses 
genome editing to generate a combinatorial 
diversity of mutations that accumulate over 
many cell divisions within a compact DNA 
barcode consisting of multiple clustered regu- 
larly interspaced short palindromic repeats 
(CRISPR)/Cas9 target sites. Lineage relation- 
ships can be readily queried by sequencing the 
edited barcodes and relating the patterns of 
edits observed. 


RESULTS: We first developed this approach in 
cell culture, editing synthetic arrays of 9 to 12 
CRISPR/Cas9 target sites to generate thou- 
sands of unique derivative barcodes. We show 
that edited barcodes can be read by targeted 
sequencing of either DNA or RNA. In addition, 
the rates and patterns of barcode editing are 
tunable and the diverse edits accumulate over 
successive divisions in a way that is informative 
of cell lineage. 

We then applied GESTALT to the zebra- 
fish Danio rerio by injecting fertilized eggs 


with editing reagents that target a genomic 
barcode bearing 10 target sites. Across dozens 
of embryos, we demonstrate the accumulation 
of hundreds to thousands of uniquely edited 
barcodes per animal, from which lineage rela- 
tionships can be inferred on the basis of shared 
mutations. In adult zebrafish, we evaluated 
the edited barcodes from ~200,000 cells and 
observed that the majority of cells in each 
organ are derived from a small number of 
progenitor cells. Furthermore, ancestral pro- 
genitors, inferred on the basis of shared mu- 
tations among subsets of cells, can contribute 
to different germ layers and organ systems. 


CONCLUSION: Our proof-of-principle experi- 
ments show that combinatorial, cumulative 
genome editing of a compact barcode can be 
used to record lineage information in multicellular 
systems. Further optimization of GESTALT will 
enable mapping of the complete cell lineage in 
diverse organisms. This method could also be 
adapted to link cell lineage information to 
molecular profiles of the same cells. In the long 
term, we envision that rich, systematically 
generated maps of organismal development, wherein 
lineage, epigenetic, transcriptional, and positional 
information are concurrently captured at single-cell 
resolution, will advance our understanding of 
development in both healthy and disease states. 
More broadly, cumulative and combinatorial genome 
editing could stably record other types of 
biological information and history in living cells. 

Read the full article at http://dx.doi.org/10.1126/science.aaf7907 

GESTALT. (Left) A barcode of CRISPR/Cas9 target sites is progressively edited over many cell divisions. (Right) Edited barcode sequences are 
related to one another on the basis of shared mutations in order to reconstruct lineage trees. 
Multicellular systems develop from single cells through distinct lineages. However, current 
lineage-tracing approaches scale poorly to whole, complex organisms. Here, we use genome 
editing to progressively introduce and accumulate diverse mutations in a DNA barcode over 
multiple rounds of cell division. The barcode, an array of clustered regularly interspaced 
short palindromic repeats (CRISPR)/Cas9 target sites, marks cells and enables the 
elucidation of lineage relationships via the patterns of mutations shared between cells. In 
cell culture and zebrafish, we show that rates and patterns of editing are tunable and that 
thousands of lineage-informative barcode alleles can be generated. By sampling hundreds of 
thousands of cells from individual zebrafish, we find that most cells in adult organs derive 
from relatively few embryonic progenitors. In future analyses, genome editing of synthetic 
target arrays for lineage tracing (GESTALT) can be used to generate large-scale maps of cell 
lineage in multicellular systems for normal development and disease. 


The tracing of cell lineages was pioneered in 
nematodes by Whitman in the 1870s, at a 
time of controversy surrounding Haeckel’s 
theory of recapitulation, which argued that 
embryological development paralleled evolutionary 
history (1). This line of work culminated 
a century later in the complete description of 
mitotic divisions in the roundworm Caenorhabditis 
elegans—a tour de force facilitated by its visual 
transparency as well as the modest size and in- 
variant nature of this nematode’s cell lineage (2). 
Over the past century, a variety of creative 
methods have been developed for tracing cell 
lineage in developmentally complex organisms (3). 
In general, subsets of cells are marked and their 
descendants followed as development progresses. 
The ways in which cell marking has been achieved 
include dyes and enzymes (4-6), cross-species trans- 
plantation (7), recombinase-mediated activation 
of reporter gene expression (8, 9), insertion of foreign 
DNA (10-12), and naturally occurring somatic 
mutations (13-15). However, despite many power- 
ful applications, these methods have limitations 
for the large-scale reconstruction of cell lineages 
in multicellular systems. For example, dye and 


¹Department of Genome Sciences, University of Washington, 
Seattle, WA, USA. ²Department of Molecular and Cellular 
Biology, Harvard University, Cambridge, MA, USA. 
³Department of Pathology, University of Washington, Seattle, 
WA, USA. ⁴Center for Brain Science, Harvard University, 
Cambridge, MA, USA. ⁵The Broad Institute of Harvard and 
MIT, Cambridge, MA, USA. ⁶FAS Center for Systems Biology, 
Harvard University, Cambridge, MA, USA. ⁷Howard Hughes 
Medical Institute, Seattle, WA, USA. 

*These authors contributed equally to this work. †Corresponding 
author. Email: shendure@uw.edu (J.S.); schier@fas.harvard.edu 
(A.F.S.) 




reporter gene-based cell marking are uninformative 
with respect to the lineage relationships between 
descendant cells. Furthermore, when two or more 
cells are independently but equivalently marked, 
the resulting multitude of clades cannot be readily 
distinguished from one another. Although these 
limitations can be overcome in part with combi- 
natorial labeling systems (16, 17) or through the 
introduction of diverse DNA barcodes (10-12), 
these strategies fall short of a system for inferring 
lineage relationships throughout an organism and 
across developmental time. In contrast, methods 
based on somatic mutations have this potential, as 
they can identify lineages and sublineages within 
single organisms (13, 18). However, somatic muta- 
tions are distributed throughout the genome, 
necessitating whole-genome sequencing (14, 15), 
which is expensive to scale beyond small numbers 
of cells and not readily compatible with in situ 
readouts (19, 20). 

What are the requirements for a system for 
comprehensively tracing cell lineages in a complex 
multicellular system? First, it must uniquely and 
incrementally mark cells and their descendants 
over many divisions and in a way that does not 
interfere with normal development. Second, these 
unique marks must accumulate irreversibly over 
time, allowing the reconstruction of lineage trees. 
Finally, the full set of marks must be easily read out 
in each of many single cells. 

We hypothesized that genome editing, which 
introduces diverse, irreversible edits in a highly 
programmable fashion (21), could be repurposed 
for cell lineage tracing in a way that realizes these 
requirements. To this end, we developed genome 
editing of synthetic target arrays for lineage 


tracing (GESTALT), a method that uses clustered 
regularly interspaced short palindromic repeats 
(CRISPR)/Cas9 genome editing to accumulate 
combinatorial sequence diversity to a compact, 
multitarget, densely informative barcode. Edited 
barcodes can be efficiently queried by a single 
sequencing read from each of many single cells 
(Fig. 1A). In both cell culture and in the zebrafish 
Danio rerio, we demonstrate the generation of 
thousands of uniquely edited barcodes that can 
be related to one another to reconstruct cell lineage 
relationships. In adult zebrafish, we observe that 
the majority of cells of each organ are derived from 
a small number of progenitor cells. Furthermore, 
ancestral progenitors, inferred on the basis of 
shared edits among subsets of derived alleles, 
make highly nonuniform contributions to germ 
layers and organ systems. 


Results 

Combinatorial and cumulative editing of 
a compact genomic barcode in cultured cells 


To investigate whether genome editing can be used 
to generate a combinatorial diversity of mutations 
within a compact region, we synthesized a con- 
tiguous array of 10 CRISPR/Cas9 targets separated 
by three base-pair (bp) linkers (total length of 257 bp). 
The first target perfectly matched one single- 
guide RNA (sgRNA), whereas the remainder were 
off-target sites for the same sgRNA, ordered from 
highest to lowest activity (22). This array of targets 
(v1 barcode) was cloned downstream of an enhanced 
green fluorescent protein (EGFP) reporter in a 
lentiviral construct (23). We then transduced 
human embryonic kidney (HEK) 293T cells with 
lentivirus and used fluorescence-activated cell 
sorting (FACS) to purify an EGFP-v1-positive 
population. To edit the barcode, we cotransfected 
these cells with a plasmid expressing Cas9 and the 
sgRNA and a vector expressing Discosoma red 
fluorescent protein (DsRed). Cells were sorted 
3 days after transfection for high DsRed expression, 
and genomic DNA (gDNA) was harvested on day 7. 
The v1 barcode was polymerase chain reaction 
(PCR) amplified, and the resulting amplicons 
were subjected to deep sequencing. 
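
The stated array length can be checked with simple arithmetic. The sketch below assumes each target occupies 23 bp (a 20-nt protospacer plus a 3-nt PAM, the usual layout for Cas9 from Streptococcus pyogenes); that per-target length is an assumption, not a figure given in the text, and the function name is hypothetical.

```python
def array_length(n_targets, target_len=23, linker_len=3):
    """Length in bp of n_targets sites joined by linkers between neighbours.

    target_len=23 is an assumption (20-nt protospacer + 3-nt PAM);
    linker_len=3 and n_targets=10 come from the text.
    """
    if n_targets < 1:
        raise ValueError("need at least one target")
    return n_targets * target_len + (n_targets - 1) * linker_len


v1_length = array_length(10)
print(v1_length)  # 10 * 23 + 9 * 3 = 257
```

Under these assumptions the result agrees with the 257-bp total length reported for the array.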

To minimize confounding sequencing errors, 
which are primarily substitutions, we analyzed 
edited barcodes for only insertion-deletion changes 
relative to the wild-type v1 barcode. In this first 
experiment, we observed 1650 uniquely edited 
barcodes (each observed in >25 reads), with diverse 
edits concentrated at the expected Cas9 cleavage 
sites, predominantly intertarget deletions in- 
volving sites 1, 3, and 5 or focal edits of sites 
1 and 3 (Fig. 1, B and C, and table S1). These 
results show that combinatorial editing of the 
barcode can give rise to a large number of unique 
sequences, i.e., alleles. 

To evaluate reproducibility, we transfected the 
same editing reagents to cultures expanded from 
three independent EGFP-v1-positive clones. Tar- 
geted reverse transcription PCR (RT-PCR) and 
sequencing of EGFP-v1 RNA showed similar dis- 
tributions of edits to the v1 barcode in the transcript 
pool, between replicates as well as in comparison to 
Fig. 1. GESTALT. (A) An unmodified array of CRISPR/Cas9 target sites (i.e., a 
barcode) is engineered into a genome (gray cell). Editing reagents are introduced 
during expansion of cell culture or in vivo development of an organism, resulting 
in a unique pattern of insertions and deletions (right) that are stably accumu- 
lated in specific lineages (green cell lineage). The lineage relationships of alleles 
that differ in sequence can often be inferred on the basis of these accumulated 
edits. (B) The 25 most frequent alleles from the edited v1 barcode are shown. 
Each row corresponds to a unique sequence, with red bars indicating deleted 
regions and blue bars indicating insertion positions. Blue bars begin at the 
insertion site, with their width proportional to the size of the insertion, which will 
rarely obscure immediately adjacent deletions. The number of reads observed 
for each allele is plotted at the right (log10 scale; the green bar corresponds to 
the unedited allele). The frequency at which each base is deleted (red) or flanks 
an insertion (blue) is plotted at the top. Light gray boxes indicate the location of 
CRISPR protospacers, and dark gray boxes indicate protospacer adjacent motif 
(PAM) sites. For the v1 array, intertarget deletions involving sites 1, 3, and 5 or focal 
(single target) edits of sites 1 and 3 were observed predominantly. (C) A histogram 
of the size distribution of insertion (top) and deletion (bottom) edits to the v1 array 
is shown. The colors indicate the number of target sites affected. Although most 
edits are short and affect a single target, a substantial proportion of edits are 
intertarget deletions. (D) We tested three array designs in addition to v1, each 
comprising 9 to 10 weaker off-target sites for the same sgRNA (v2 to v4) (22). 
Editing of the v2 array is shown with layout as described in (B). Editing of the 
v3 and v4 arrays is shown in fig. S3, A and B. The weaker sites within these 
alternative designs exhibit lower rates of editing than the v1 array but also a 
much lower proportion of intertarget deletions. (E) A histogram of the size 
distribution of insertion (top) and deletion (bottom) edits to the v2 array is 
shown. In contrast with the v1 array, almost all edits affect only a single target. 


the previous experiment (fig. S1). These results 
show that the observed editing patterns are 
largely independent of the site of integration 
and that edited barcodes can be queried from 
either RNA or DNA. 

To evaluate how editing outcomes vary as a 
function of Cas9 expression, we cotransfected 
EGFP-v1-positive cells with a plasmid expressing 
Cas9 and the sgRNA, as well as a DsRed vector, 
and after 4 days we sorted cells into low, medium, 
and high DsRed bins and harvested gDNA. Overall 
editing rates matched DsRed expression (frequency 
of non-wild-type barcodes: low DsRed = 40%; 
medium DsRed = 69%; high DsRed = 91%). The 
profile of edits observed remained similar, but 
there were fewer intertarget deletions in the lower 
DsRed bins (fig. S2). These results show that 
adjusting expression levels of editing reagents 
can be used to modify the rates and patterns of 
barcode editing. 

We also synthesized and tested three barcodes 
(v2 to v4) with nine or ten weaker off-target sites 
for the same sgRNA as used for v1 (22). Genome 
editing resulted in derivative barcodes with sub- 
stantially fewer edits than seen with the v1 bar- 
code, but a much greater proportion of these edits 
were to a single target site—i.e., fewer intertarget 
deletions were observed (Fig. 1, D and E, and fig. 
S3, A and B). As only a few targets were 
substantially edited in designs v1 to v4, we combined 
the most highly active targets to a new, 
12-target barcode (v5). This barcode exhibited 
more uniform usage of constituent targets, but 
with relative activities still ranging over two or- 
ders of magnitude (fig. S3C and table S1). These 
results illustrate the potential value of iterative 
barcode design. 

To determine whether the means of editing 
reagent delivery influences patterns of barcode 
editing, we introduced a lentiviral vector expressing 
Cas9 and the same sgRNA to cells containing the v5 
barcode (24). After 2 weeks of culturing a population 
bottlenecked to 200 cells by FACS, we observed 
diverse barcode alleles, but with substantially 
fewer intertarget deletions than with episomal 
delivery of editing reagents (fig. S3D). This find- 
ing demonstrates that the allelic spectrum can 
also be modulated by the delivery mode of edit- 
ing reagents. 

Taken together, these results show that editing 
multiple target sites within a compact barcode can 
generate a combinatorial diversity of alleles, and 
also that these alleles can be read out by single 
sequencing reads derived from either DNA or 
RNA. Rates and patterns of barcode editing are 
tunable by using targets with different activities, 
and/or off-target sequences, by iteratively recom- 
bining targets to new barcode designs and by mod- 
ulating the concentration and means of delivery 
of editing reagents. 


Reconstruction of lineage relationships 
in cultured cells 


To determine whether GESTALT could be used 
to reconstruct lineage relationships, we applied 
it to a designed lineage in cell culture (Fig. 2). A 
monoclonal population of EGFP-v1-positive cells 
was transfected with editing reagents to induce 
a first round of mutations in the v1 barcode. 
Clones derived from single cells were expanded, 
sampled, split, and retransfected with editing 
reagents to induce a second round of mutations 
of the v1 barcode. For each clonal population, 
two 100-cell samples of the re-edited popula- 
tions were expanded and harvested for gDNA. 
In these experiments, we began incorporating 
unique molecular identifiers (UMIs) (10 bp) dur- 
ing amplification of barcodes by a single round 
of polymerase extension (fig. S4A). Each UMI tags 
the single barcode present within each single 
cell, thereby allowing for correction of subsequent 
PCR amplification bias and enabling each UMI- 
barcode combination to be interpreted as de- 
riving from a single cell (25). 
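
The UMI logic described above can be sketched as follows: reads are grouped by UMI, each UMI is assigned the allele seen most often among its reads, and each UMI then counts as one cell regardless of how many PCR copies were sequenced. This is a minimal illustration, not the authors' pipeline; the function names, the majority-vote rule, and the toy reads are assumptions.

```python
from collections import Counter, defaultdict


def collapse_umis(reads):
    """Map each UMI to its most frequent allele.

    reads: iterable of (umi, allele) pairs from one sample.
    Majority vote per UMI is an assumed consensus rule.
    """
    by_umi = defaultdict(Counter)
    for umi, allele in reads:
        by_umi[umi][allele] += 1
    return {umi: counts.most_common(1)[0][0] for umi, counts in by_umi.items()}


def allele_cell_counts(reads):
    """Count cells (unique UMIs) per allele, ignoring read depth."""
    return Counter(collapse_umis(reads).values())


# Toy data: three reads of one UMI still count as a single cell.
reads = [
    ("AAAC", "del1"), ("AAAC", "del1"), ("AAAC", "del1"),
    ("GGTA", "del1"),
    ("TTCA", "ins3"), ("TTCA", "ins3"),
]
counts = allele_cell_counts(reads)
print(dict(counts))
```

Counting UMIs rather than reads is what removes the PCR amplification bias: an allele that amplifies well gains reads but not cells.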

Seven of 12 clonal populations we isolated contained 
mutations in the v1 barcode that were unambiguously 
introduced during the first round of 
editing (Fig. 2A). Additional edits accumulated in 
re-edited cells but generally did not disrupt the 
early edits (Fig. 2B and fig. S5). We next sought 
to reconstruct the lineage relationships between 
all alleles observed in the experiment using a 
maximum parsimony approach (fig. S4B) (26). 
The resultant tree contained major clades that were 
defined by the early edits present in each lineage 
(Fig. 2C). Four clonal populations (nos. 3, 5, 7, and 8) 
were cleanly separated upon lineage reconstruction, 
with >99.7% of cells accurately placed into each 
lineage’s major clade. Two lineages (nos. 1 and 6) 
were mixed because they shared identical muta- 
tions from the first round of editing. These most 
likely represent the recurrence of the same editing 
event across multiple lineages but could also have 
been daughter cells subsequent to a single, early 
editing event prior to isolating clones. Consequently, 
99.9% of cells of these two lineages were assigned to 
a single clade (Fig. 2C, blue). One clonal population 
(no. 4) appears to have derived from two independent 
cells, one of which harbored an unedited barcode. 
Later editing of these barcodes confounded 
the assignment of this lineage on the tree. Overall, 
however, these results demonstrate that GESTALT 
can be used to capture and reconstruct cell lineage 
relationships in cultured cells. 


Combinatorial and cumulative editing of 
a compact genomic barcode in zebrafish 


To determine the potential of GESTALT for in 
vivo lineage tracing in a complex multicellular 
organism, we turned to the zebrafish D. rerio. 
We designed two new barcodes, v6 and v7, each 
with 10 sgRNA target sites that are absent from 
the zebrafish genome and predicted to be highly 
editable (see the supplementary materials). In 
contrast to v1 to v5, in which the target sites are 
variably editable by one sgRNA, the targets within 
v6 or v7 are designed to be edited by distinct 
sgRNAs. We generated transgenic zebrafish that 
harbor each barcode in the 3’ UTR of DsRed 
driven by the ubiquitin promoter (27, 28) and a 
GFP marker that is expressed in the cardiomyocytes 
of the heart (fig. S6) (29). To evaluate whether 
diverse alleles could be generated by in vivo genome 
editing, we injected Cas9 and 10 different sgRNAs 
with perfect complementarity to the barcode 
target sites into single-cell v6 embryos (Fig. 3A). 
Editing of integrated barcodes had no noticeable 
effects on development (fig. S7). To characterize 
barcode editing in vivo, we extracted gDNA from 
a series of single 30-hours-post-fertilization (hpf) 
embryos, and UMI-tagged, amplified, and sequenced 
the v6 barcode. In control embryos (Cas9−) (n = 2), 
all 4488 captured barcodes were unedited. In 
contrast, in edited embryos (Cas9+) (n = 8), fewer 
than 1% of captured barcodes were unedited. We 
recovered barcodes from hundreds of cells per 
embryo (median 943; range 257 to 2832) and 
identified dozens to hundreds of alleles per embryo 
(median 225; range 86 to 1323). Within single 
embryos 41 ± 10% of alleles were observed 
recurrently, most likely reflecting alleles that were 
generated in a progenitor of two or more cells. 
Fewer than 0.01% of alleles were shared in 
pairwise comparisons of embryos, revealing 
the highly stochastic nature of editing in dif- 
ferent embryos. These results demonstrate that 
GESTALT can generate very high allelic diver- 
sity in vivo. 


Reconstruction of lineage relationships 
in embryos 


To evaluate whether lineage relationships can 
be reconstructed using edited barcodes, we fo- 
cused on the v6 embryo with the lowest rates of 
intertarget deletions and edited target sites (Fig. 
3B; avg. 58% ± 27% of target sites no longer a 
perfect match to the unedited target, compared 
to 87% ± 21% for all other 30 hpf v6 embryos). 
Application of our parsimony approach (fig. S4B) 
to the 1,961 cells in which we observed 1,323 dis- 
tinct alleles generated the large tree shown in 
Fig. 4. 1,307 of the 1,323 (98%) alleles could be 
related to at least one other allele by one or more 
shared edits, 85% by two or more shared edits, 
and 56% by three or more shared edits. These 
results illustrate the principle of using patterns 


of shared edits between distinct barcode alleles 
to reconstruct their lineage relationships in vivo. 
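
The statistic reported above (the share of alleles related to at least one other allele by k or more shared edits) can be illustrated with a small sketch. Each allele is represented as a set of edit labels, and every pair of alleles is compared. This is a pairwise tally only, not the maximum parsimony reconstruction used for the tree; the function names and the toy edit labels are hypothetical.

```python
def max_shared_edits(alleles):
    """For each allele, the largest number of edits shared with any other allele.

    alleles: dict mapping allele name -> frozenset of edit labels.
    """
    names = list(alleles)
    best = {name: 0 for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = len(alleles[a] & alleles[b])
            best[a] = max(best[a], shared)
            best[b] = max(best[b], shared)
    return best


def fraction_related(alleles, k):
    """Fraction of alleles sharing at least k edits with some other allele."""
    best = max_shared_edits(alleles)
    return sum(1 for v in best.values() if v >= k) / len(best)


# Toy alleles; labels such as "t1:del5" (target 1, 5-bp deletion) are made up.
alleles = {
    "A": frozenset({"t1:del5", "t3:ins2"}),
    "B": frozenset({"t1:del5", "t3:ins2", "t7:del1"}),
    "C": frozenset({"t1:del5", "t9:del12"}),
    "D": frozenset({"t4:del3"}),
}
print(fraction_related(alleles, 1))  # 0.75: A, B and C share t1:del5
print(fraction_related(alleles, 2))  # 0.5: only A and B share two edits
```

The pairwise loop is quadratic in the number of alleles, which is acceptable for roughly a thousand alleles per embryo but would need indexing by edit for much larger sets.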


Developmental timing of barcode editing 


To determine the developmental timing of bar- 
code editing, we injected Cas9 and 10 sgRNAs into 
one-cell stage v7 transgenic embryos and harvested 
genomic DNA before gastrulation (dome stage, 
4.3 hpf; n = 10 embryos), after gastrulation (90% 
epiboly/bud stage, 9 hpf; n = 11 embryos), at 
pharyngula stage (30 hpf; n = 12 embryos), and 
from early larvae (72 hpf; n = 12 embryos) (Fig. 3A). 
We recovered barcode sequences from a median 
of 8785 cells per embryo (range 461 to 31,640; total 
of 45 embryos), comprising a median of 1223 alleles 
per embryo (range 15 to 4195) (Fig. 3C). Within 
single embryos, 65 ± 6% of alleles were observed 
recurrently, whereas in pairwise comparisons of 
embryos only 2 ± 5% of alleles were observed 
recurrently. The abundances of alleles were well 
correlated between technical replicates for each 
of two 72-hpf embryos (fig. S8, A and B), and 
alleles containing many edits were more likely 
to be unique to an embryo than those with few 
edits (fig. S8C). To assess when editing begins, 
we analyzed the proportions of the most common 
editing events across all barcodes sequenced in a 
given embryo, reasoning that the earliest edits 
would be the most frequent. Across 8 v6 and 45 v7 
embryos, we never observed an edit that was 
present in 100% of cells. This observation indicates 
that no permanent edits were introduced at the 
one-cell stage. In nearly all embryos, we observed 
that the most common edit is present in >10% of 
cells, and in some cases in ~50% of cells (Fig. 3D 
and fig. S9). This observation also holds in 
~4000-cell dome-stage embryos, which result from 
~12 rounds of largely synchronous division un- 
accompanied by cell death. Most of these edits 
are rare or absent in other embryos, suggesting 
that they are unlikely to have arisen recurrently 
within each lineage. These results suggest that the 
edits present in ~50% of cells were introduced at 
the two-cell stage and that the edits present in >10% 
of cells were introduced before the 16-cell stage. 
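
The reasoning behind these stage estimates can be made explicit. If an edit arises in one cell of an embryo with 2^k cells, and all cells then divide equally with no cell death, the edit ends up in 1/2^k of the cells. The sketch below inverts that relation; it is an idealization that assumes a single barcode copy per cell and unbiased sampling, and the function name is hypothetical.

```python
import math


def latest_stage_for_edit(fraction):
    """Largest embryo size (in cells, a power of two) at which an edit made in
    a single cell could still be present in `fraction` of all descendants.

    Assumes synchronous divisions, no cell death and one barcode per cell.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = math.floor(math.log2(1 / fraction))
    return 2 ** k


print(latest_stage_for_edit(0.5))  # 2: consistent with the two-cell stage
print(latest_stage_for_edit(0.1))  # 8: i.e., before the 16-cell stage
```

An edit seen in more than 10% of cells exceeds 1/16 of the embryo, so under these assumptions it cannot have arisen in a single cell at the 16-cell stage or later.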
How long does barcode editing persist? Two 
aspects of the data suggest that it tapers relatively 
early in development. First, in dome-stage embryos 
(4.3 hpf), we captured barcodes from a median of 
2086 cells, in which a median of 4.8 targets were 
edited. Although the number of cells and alleles 
that we were able to sample increased at the later 
developmental stages, the proportion of edited 
sites appeared relatively stable (Fig. 3C). If editing 
were occurring throughout this time course, we 
would instead expect the proportion of edited 
sites to increase substantially. Second, the number 
of unique alleles appears to saturate early, never 
exceeding 4200 (Fig. 3E). For example, only 4195 
alleles were observed in a 72-hpf embryo in which 
we sampled the highest number of cells (n = 
31,639). These results suggest that the majority 
of editing events occurred before dome stage. 
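
The saturation argument rests on a rarefaction-style analysis: barcodes from an embryo are drawn in random order without replacement, and the cumulative number of distinct alleles is tracked and averaged over repeated shuffles. The sketch below illustrates that procedure on toy data; the function name, the fixed seed, and the example input are assumptions, and only the figure of 500 iterations is taken from the Fig. 3E legend.

```python
import random


def rarefaction_curve(cell_alleles, iterations=500, seed=0):
    """Mean number of unique alleles seen after sampling 1..N cells.

    cell_alleles: list with one allele label per captured cell.
    Cells are drawn without replacement by shuffling a copy of the list.
    """
    rng = random.Random(seed)
    n = len(cell_alleles)
    totals = [0] * n
    for _ in range(iterations):
        order = list(cell_alleles)
        rng.shuffle(order)
        seen = set()
        for i, allele in enumerate(order):
            seen.add(allele)
            totals[i] += len(seen)
    return [t / iterations for t in totals]


# Toy embryo: 10 cells carrying 3 alleles of unequal abundance.
cells = ["a"] * 6 + ["b"] * 3 + ["c"]
curve = rarefaction_curve(cells, iterations=50)
print(curve[0], curve[-1])  # starts at 1.0 and ends at the 3 alleles present
```

A curve that flattens well before all cells are sampled suggests that additional sampling would reveal few new alleles, which is the pattern the text interprets as early tapering of editing.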


Editing diversity in adult organs 


To evaluate whether barcodes edited during embryo- 
genesis can be recovered in adults, we dissected 
Fig. 2. Reconstruction of a synthetic lineage based on genome editing 
and targeted sequencing of edited barcodes. (A) A monoclonal population 
of cells was subjected to editing of the v1 array. Single cells were expanded, 
sampled (nos. 1 to 12), retransfected to induce a second round of barcode 
editing, and then expanded and sampled from 100-cell subpopulations (1a 
and 1b to 12a and 12b). For clarity, the five clones where the original popu- 
lation was unedited are not shown. (B) Alleles observed in the synthetic 
lineage experiment are shown, with layout as described in the Fig. 1B legend. 
Cell population 1 represents sampling of cells that had been subjected to only 
the first round of editing; virtually all cells contain a shared edit to the first 
target. Populations 1a and 1b are derived from 1 but are subjected to a second 
round of editing prior to sampling. These retain the edit to the first target, but 
subpopulations bear additional edits to other targets. (C) Maximum parsimony 
reconstruction using PHYLIP Mix (see Materials and Methods and fig. S4B) from 
alleles seen two or more times in the seven cell lineages represented in (A). 
Lineage membership and abundance of each allele are shown on the right. Pro- 
genitor cell lineage 4 (orange) appears to be derived from two cells, one edited 
and the other wild-type. Only 62% of lineage 4 falls into a single clade, consistent 
with the proportion (64%) of the lineage edited after the first round. We assume 
that cells unedited in the first round either accrued edits matching other lineages 
(thus causing mixing) or accrued different edits (thus remaining outside the 
major clades). 


two edited 4-month-old v7 transgenic zebrafish 
(ADR1 and ADR2) (Fig. 5A). We collected organs 
representing all germ layers: the brain and both 
eyes (ectodermal), the intestinal bulb and posterior 
intestine (endodermal), the heart and blood (meso- 
dermal), and the gills (neural crest, with contribu- 
tions from other germ layers). We further divided 
the heart into four samples: a piece of heart tissue, 
dissociated unsorted cells (DHCs), FACS-sorted 
GFP+ cardiomyocytes, and noncardiomyocyte heart 


cells (NCs) (fig. S10). We isolated genomic DNA 
from each sample, amplified and sequenced edited 
barcodes with high technical reproducibility (fig. S11), 
and observed barcode editing rates akin to those 
in embryos (fig. S12). For zebrafish ADR1, we 
captured barcodes from between 776 and 44,239 
cells from each tissue sample (median 17,335), 
corresponding to a total of 197,461 cells and 1138 
alleles. For zebrafish ADR2, we captured barcodes 
from between 84 and 52,984 cells from each tissue 
sample (median 20,973), corresponding to a total 
of 217,763 cells and 2016 alleles. These results 


show that edits introduced to the barcode during 
embryogenesis are inherited through development 
and tissue homeostasis and can be detected in 
adult organs. 


Differential contribution of embryonic 
progenitors to adult organs 


To analyze the contribution of diverse alleles to 
different organs, we compared the frequency of 
edited barcodes within and between organs. We 
first examined blood [of note, zebrafish erythrocytes 
Fig. 3. Generating combinatorial barcode diversity 
in transgenic zebrafish. (A) One-cell 
zebrafish embryos were injected with complexed 
Cas9 ribonucleoproteins (RNPs) containing 
sgRNAs that matched each of the 10 targets 
in the array (v6 or v7). Embryos were collected 
at the time points indicated. UMI-tagged bar- 
codes were amplified and sequenced from 


genomic DNA. (B) Patterns of editing in alleles recovered from a 30-hpf v6 embryo, with layout as described in the Fig. 1B legend. (C) Bar plots show the 
number of cells sampled (top), unique alleles observed (middle), and the average number of sites edited (bottom) for 45 v7 embryos collected at four 
developmental time points and two levels of Cas9 RNP (1/3x and 1x). Colors correspond to stages shown in (A). Although more alleles are observed with 
sampling of larger numbers of cells at later time points, the proportion of target sites edited remains relatively constant. (D) Bar plots show the proportion of 
edited barcodes containing the most common editing event in a given embryo. Six of 45 embryos had the most common edit in approximately 50% of cells 
(dashed line), consistent with this edit having occurred at the two-cell stage (see fig. S8A for example). Colors correspond to stages shown in (A). These same 
edits are rarer or absent in other embryos (gray bars below). (E) For each of the 45 v7 embryos, all barcodes observed were sampled without replacement. The 
cumulative number of unique alleles observed as a function of the number of cells sampled is shown (average of the 500 iterations shown per embryo; two 


levels of Cas9 RNP: 1/3x on left, 1x on right). The number of unique alleles observed, even in later developmental stages where we are sampling much larger 


numbers of cells, appears to saturate, and there is no consistent pattern supporting substantially greater diversity in later time points, consistent with the 
bottom row of (C) in supporting the conclusion that the majority of editing occurs before dome stage. 


are nucleated (30)]. Only five alleles defined over 
98% of cells in the ADR1 blood sample (Fig. 5B), 
suggesting highly clonal origins of the adult zebra- 
fish blood system from a few embryonic progen- 
itors. Consistent with the presence of blood in all 
dissected organs, these common blood alleles were 
also observed in all organs (10 to 40%) (Fig. 5C) 
but largely absent from cardiomyocytes isolated 
by flow sorting (0.5%). Furthermore, the relative 
proportions of these five alleles remained largely 
constant in all dissected organs, suggesting that they 
primarily mark the blood and do not substantially 
contribute to nonblood lineages (Fig. 5D). In per- 
forming similar analyses of clonality across all 
organs (while excluding the five most common 
blood alleles), we observed that a small subset 
of alleles dominates each organ (Fig. 5E). Indeed, 
for all dissected organs, fewer than 7 alleles 
comprised >50% of cells (median 4, range 2 to 6), 
and, with the exception of the brain, fewer than 
25 alleles comprised >90% of cells (median 19, 
range 4 to 38). Most of these dominant alleles 
were organ-specific—i.e., although they were 
found rarely in other organs, they tended to be 
dominant in only one organ (Fig. 5F). For exam- 
ple, the most frequent allele observed in the in- 
testinal bulb comprised 13.6% of captured nonblood 
cells observed in that organ but <0.01% of cells 
observed in any other organ. There are exceptions, 
however. For example, one allele is observed in 
24.7% of sorted cardiomyocytes, 13.4% of the 
intestinal bulb, and at lower abundances in all 
other organs. Similar results were observed in 
ADR2 (fig. S13). These results indicate that the
majority of cells in diverse adult organs are
descended from a few differentially edited embryonic
precursors.
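
The clonality measure used above (how many of the most frequent alleles account for a given share of an organ's cells) can be illustrated with a minimal Python sketch. The function name `alleles_to_reach` and the toy cell counts are hypothetical; they are not taken from the study's data or analysis code.

```python
from collections import Counter


def alleles_to_reach(cell_alleles, fraction):
    """Return how many of the most frequent alleles are needed to cover
    more than `fraction` of the sampled cells."""
    counts = Counter(cell_alleles)
    total = sum(counts.values())
    covered = 0
    # Walk alleles from most to least frequent, accumulating cell counts.
    for rank, (_, n_cells) in enumerate(counts.most_common(), start=1):
        covered += n_cells
        if covered / total > fraction:
            return rank
    return len(counts)


# Hypothetical organ sample: one allele label per captured cell.
organ = (["a1"] * 40 + ["a2"] * 25 + ["a3"] * 15
         + ["a4"] * 8 + ["a5"] * 7 + ["a6"] * 5)

print(alleles_to_reach(organ, 0.5))  # 2
print(alleles_to_reach(organ, 0.9))  # 5
```

In this toy sample, two alleles cover more than half of the cells, which is the kind of dominance by a few alleles reported for the dissected organs.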


Reconstructing lineage relationships in 
adult organs 


To reconstruct the lineage relationships between 
cells both within and across organs on the basis 
of shared edits, we again relied on maximum par- 
simony methods (fig. S4B). The resulting trees for 
ADR1 and ADR2 are shown in Fig. 6 and fig. S14,
respectively. We observed clades of alleles that 
shared specific edits. For example, ADR1 had 
eight major clades, each defined by “ancestral” 
edits that are shared by all captured cells as- 
signed to that clade (Fig. 7A; also indicated by 
colors in the tree shown in Fig. 6). Collectively, 
these clades comprised 49% of alleles and 90% 
of the 197,461 cells sampled from ADR1 (Fig. 7A).
Three major clades (nos. 3, 6, and 7) contributed
to blood (Fig. 7B). After reallocating the
five dominant blood alleles from the composi- 
tion of individual organs back to blood (Fig. 5B 
and fig. S15), we observed that all major clades 
made highly nonuniform contributions across 
organs. For example, clade 3 contributed almost 
exclusively to mesodermal and endodermal or- 
gans, while clade 5 contributed almost exclusive- 
ly to ectodermal organs. These results reveal that 
GESTALT can be used to infer the contributions 
of inferred ancestral progenitors to adult organs. 
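
Once ancestral edits have been identified, grouping alleles into clades amounts to a set-containment check. The Python sketch below illustrates only that grouping step; it is not the maximum parsimony reconstruction itself, and the edit labels, clade names, and the function `assign_clades` are hypothetical. Alleles carrying ancestral edits from more than one clade are left unassigned, mirroring the exclusion described in the Fig. 7A legend.

```python
def assign_clades(alleles, clade_edits):
    """Map each allele name to the single clade whose ancestral edits it
    contains, or to None if it matches no clade or more than one clade."""
    assignment = {}
    for name, edits in alleles.items():
        # A clade matches when all of its ancestral edits occur in the allele.
        matches = [clade for clade, ancestral in clade_edits.items()
                   if ancestral <= edits]
        assignment[name] = matches[0] if len(matches) == 1 else None
    return assignment


# Hypothetical ancestral edits (target site : edit) defining two clades.
clade_edits = {
    "clade1": {"t1:8D"},
    "clade2": {"t3:2I", "t4:5D"},
}

# Hypothetical alleles, each a set of edits observed in the barcode.
alleles = {
    "x": {"t1:8D", "t6:3D"},           # clade1 plus a later edit
    "y": {"t3:2I", "t4:5D", "t9:1I"},  # clade2 plus a later edit
    "z": {"t1:8D", "t3:2I", "t4:5D"},  # ancestral edits of both clades
    "w": {"t7:4D"},                    # no ancestral edits
}

result = assign_clades(alleles, clade_edits)
print(result)  # {'x': 'clade1', 'y': 'clade2', 'z': None, 'w': None}
```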

Although some ancestral clades appear to con- 
tribute to all germ layers, we find that subclades, 
defined by additional shared edits within a clade, 
exhibit greater specificity. For example, although 
clade 1 contributes substantially to all organs
except blood, additional edits divide clade 1 into
three subclades with greater tissue restriction 
(Fig. 7, C and D). The 1+A subclade primarily con- 
tributes to mesendodermal organs (heart and both 
gastrointestinal organs), whereas the 1+C subclade 
primarily contributes to neuroectodermal organs 
(brain, left eye, and gills). Similar patterns are 
observed for clade 2 (Fig. 7, E and F), where the 
2+A subclade contributes primarily to mesen- 
dodermal organs, the 2+B subclade to the heart, 
and the 2+C subclade to neuroectodermal organs. 
Additional edits divide these subclades into further 
tissue-specific sub-subclades. For example, whereas 
the 2+A subclade is predominantly mesendoderm, 
additional edits define 2+A+D (heart, primarily 
cardiomyocytes), 2+A+E (heart and posterior 
intestine), and 2+A+F (intestinal bulb). All of 
the major clades exhibit similar patterns of in- 
creasing restriction with additional edits (Fig. 7, 
C to F, and fig. S16). Similar observations were
made in fish ADR2 (fig. S17). These results indicate 
that GESTALT can record lineage relationships 
across many cell divisions and capture information 
both before and during tissue restriction. 


Discussion 


We describe a method, GESTALT, which uses 
combinatorial and cumulative genome editing to 
record cell lineage information in a highly multi- 
plexed fashion. We successfully applied this method 
to both artificial lineages (cell culture) and a
whole organism (zebrafish). Full-tree reconstructions 
for cell culture, zebrafish embryo, and zebrafish adult 
experiments are provided at
http://gestalt.gs.washington.edu.

Fig. 4. Lineage reconstruction of an edited zebrafish embryo. (A) A lineage
reconstruction of 1323 alleles recovered from the v6 embryo also represented 
in Fig. 3B, generated by a maximum parsimony approach implemented in the 
PHYLIP Mix package (see Materials and Methods and fig. S4B). A dendro- 
gram to the left of each column represents the lineage relationships, and the 
alleles are represented on the right. Each row represents a unique allele. 
Matched colored arrows and dashed lines connect subsections of the tree
together. There are many large clades of alleles sharing specific edits, as well
as subclades defined by “dependent” edits. These dependent edits occur 
within a clade defined by a more frequent edit but are rare or absent elsewhere 
in the tree. (B) A portion of the tree is shown at higher resolution. Two edits 
are shared by all alleles in this clade. Six independent edits define descendant 
subclades within this clade, and further edits define additional sub-subclades 
within the clade.

are present in varying
proportions (10 to 40%) in all intact organs except the FACS-sorted cardiomyocyte
population (0.5%). All other alleles are summed in gray. (E) The
cumulative proportion of cells (y axis) represented by the most frequent alleles
(x axis) for each adult organ of ADR1 is shown, as well as the adult organs in
aggregate. In all adult organs except blood, the five dominant blood alleles are 
excluded. All organs exhibit dominance of sampled cells by a small number of 
progenitors, with fewer than seven alleles comprising the majority of cells. For 


The strengths of GESTALT include (i) the com- 
binatorial diversity of mutations that can be 
generated within a dense array of CRISPR/Cas9 
target sites; (ii) the potential for informative mu- 
tations to accumulate across many cell divisions 
and throughout an organism’s developmental 
history; (iii) the ability to scalably query lineage 
information from at least hundreds of thousands 
of cells and with a single sequencing read per 
single cell; and (iv) the likely applicability of 
GESTALT to any organism, from bacteria and 
plants to vertebrates, that allows genome editing, 
as well as human cells (e.g., tumor xenografts). 
Even in organisms in which transgenesis is not 
established, lineage tracing by genome editing 
may be feasible by expressing editing reagents 
to densely mutate an endogenous, nonessential 
genomic sequence. 

Our experiments also highlight several remain- 
ing technical challenges. Chief among these are (i) 
the chance recurrence of identical edits or similar
patterns of edits in distantly related cells can
confound lineage inference; (ii) nonuniform editing 
efficiencies and intertarget deletions within the 
barcode contribute to suboptimal sequence diversity 
and loss of information, respectively; (iii) the 
transient means by which Cas9 and sgRNAs are 
introduced likely restrict editing to early embryo- 
genesis; (iv) the computational challenge of pre- 
cisely defining the multiple editing events that give 
rise to different alleles complicates the unequivocal 
reconstruction of lineage trees; and (v) the difficulty 
of isolating tissues without contamination by 
blood and other cells can hinder the assignment 
of alleles to specific organs. A broader set of chal- 
lenges includes the lack of information about the 
precise anatomical location and exact cell type of 
each queried cell, the fact that genome editing 
events are not directly coupled to the cell cycle, 
and the failure to recover all cells. These chal- 
lenges currently hinder the reconstruction of a 
lineage tree as complete and precise as the one 


comparison, a similar plot for the median embryo (dashed line) from each time 
point of the developmental time course experiment is also shown. (F) The 
distribution of the most prevalent alleles for each organ, after removal of the five 
dominant blood alleles, across all organs. The most prevalent alleles were 
defined as being at >5% abundance in a given organ (median 5 alleles, range 4 
to 7). Organ proportions were normalized by column and colored as shown in the 
legend. Underlying data are presented in table S2. 


that Sulston and colleagues described for C. elegans 
(2). Despite these limitations, our proof-of-principle 
study shows that GESTALT can inform develop- 
mental biology by richly defining lineage relation- 
ships among vast numbers of cells recovered from 
an organism. 

The current challenges highlight the need for 
further optimization of the design of targets and 
arrays, as well as the delivery of editing reagents. 
For example, an array containing twice as many 
targets as used here could fit within a single read 
on contemporary sequencing platforms, thus 
yielding more lineage information per cell without 
sacrificing throughput. Also, as we have shown, 
adjustments to the target sequences and dosages 
of editing reagents can be used to fine-tune muta- 
tion rates and to minimize undesirable intertarget 
deletions. Finally, sgRNA sequences and lengths 
(31), Cas9 cleavage activity and target preferences 
(32, 33), and the means by which Cas9 and sgRNA(s)
are expressed [e.g., transient, constitutive (34), or

Fig. 6. Lineage reconstruction for adult zebrafish ADR1. Unique alleles
sequenced from adult zebrafish organs can be related to one another using a
maximum parsimony approach implemented in the PHYLIP Mix package (see
Materials and Methods and fig. S4B). For reasons of space, we show a tree reconstructed
from the 601 ADR1 alleles observed at least five times in individual
organs. Eight major clades are displayed with colored nodes, each defined by
“ancestral” edits that are shared by all alleles assigned to that clade (shown in
Fig. 7A). Editing patterns in individual alleles are represented as shown previously.
Alleles observed in multiple organs are plotted on separate lines per
organ and are connected with stippled branches. Two sets of bars outside the
alleles identify the organ in which the allele was observed and the proportion of
cells in that organ represented by that allele (log10 scale).


induced (35, 36)], can be altered to control the 
pace, temporal window, and tissue(s) at which the 
barcodes are mutated. For example, coupling 
editing to cell cycle progression might enable 
higher resolution reconstruction of lineage rela- 
tionships throughout development. 

Our application of GESTALT to a vertebrate 
model organism, zebrafish, demonstrates its potential
to yield insights into developmental biology.
First, our results suggest that relatively few
embryonic progenitor cells give rise to the majority 
of cells of many adult zebrafish organs, reminiscent 
of clonal dominance (37, 38). For example, only 
5 of the 1138 alleles observed in ADR1 gave rise 
to >98% of blood cells, and for all dissected 
organs, fewer than 7 alleles comprised >50% of
cells. There are several mechanisms by which
such dominance can emerge—e.g., by uneven 
starting populations in the embryo, drift, compe- 
tition, interference, unequal cell proliferation or 
death, or a combination of these mechanisms 
(39-42). Controlling the temporal and spatial in- 
duction of edits and isolating defined cell types 
from diverse organs should help resolve the

Fig. 7. Clades and subclades corresponding to inferred progenitors exhibit
increasing levels of organ restriction. (A) (Top) The parsimony-inferred
ancestral edits that define eight major clades of ADR1 are shown, with the 
total number of cells in which these are observed indicated on the right. 
(Bottom) Contributions of the eight major clades to all cells or all alleles. 
Nineteen alleles (out of 1138 total) that contained ancestral edits from more 
than one clade were excluded from assignment to any clade and from any 
further lineage analysis. (B) Contributions of each of the eight major clades to 
each organ, displayed as a proportion of each organ. To accurately display the 
contributions of the eight major clades to each organ, we first reassigned the 
five dominant blood alleles from other organs back to the blood. The total 
number of cells and alleles within a given major clade are listed below. The 
clade contributions of all clades and subclades are presented in table S3. For 
heart subsamples: piece of heart, a piece of heart tissue; DHCs, dissociated 
unsorted cells; cardiomyocytes, FACS-sorted GFP+ cardiomyocytes; and NCs,
noncardiomyocyte heart cells. (C and E) Edits that define subclades of clade 1 (C)
and clade 2 (E), with the total number of cells in which these are observed 
indicated on the right. A gray box indicates an unedited site or sites, dis- 
tinguishing it from related alleles that contain an edit at this location. (D and F) 
Lineage trees corresponding to subclades of clade 1 (D) and clade 2 (F) that 
show how dependent edits are associated with increasing lineage restriction. 
The pie chart at each node indicates the organ distribution within a clade or 
subclade. Ratios of cell proportions are plotted, a normalization that accounts 
for differential depth of sampling between organs. Labels in the center of each 
pie chart correspond to the subclade labels in (C) and (E). Alleles present in a 
clade but not assigned to a descendant subclade (either they have no add- 
itional lineage restriction or are at low abundance) are not plotted for clarity. 
The number of cells (and the number of unique alleles) are also listed, and 
terminal nodes also list major organ restriction(s), i.e., those comprising 
>25% of a subclade by proportion. 


mechanisms by which different embryonic pro- 
genitors come to dominate different adult organs. 

Second, we show that GESTALT can inform 
the lineage relationships among thousands of 
differentiated cells. For example, following the
accumulation of edits from ancestral to more complex
alleles reveals the progressive restriction of progenitors
to germ layers and then organs. Cells
within an organ can both share and differ in their 
alleles, revealing additional information about
organ development. Future studies will need to
determine whether such lineages reflect distinct 
cell fates (e.g., blood sublineages or neuronal sub- 
populations), because the anatomical resolution 
at which we queried alleles was restricted to 
grossly dissected organs and tissues. Because 
edited barcodes are expressed as RNA, we envision 
that combining our system with other platforms 
will permit much greater levels of anatomical resolution
without sacrificing throughput. For example,
in situ RNA sequencing (RNA-seq) of
barcodes would provide explicit spatial and his- 
tological context to lineage reconstructions (19, 20). 
Also, capturing richly informative lineage markers 
in single-cell RNA-seq or assay for transposase- 
accessible chromatin (ATAC)-seq data sets may 
inform the interpretation of those molecular 
phenotypes, while also adding cell type resolution
to studies of lineage (43, 44). Such integration
may be particularly relevant to efforts to
build comprehensive atlases of cell types. Because
these single-cell methods generate many
reads per single cell, this would also facilitate 
using multiple, unlinked target arrays. In prin- 
ciple, the combined diversity of the barcodes 
queried from single cells could be engineered to 
uniquely identify every cell in a complex orga- 
nism. In addition, orthogonal imaging-based lin- 
eage tracing approaches in fixed and live samples 
[e.g., Brainbow and related methods (6, 29)] and 
longitudinal whole-animal imaging approaches 
(45, 46) might be used in parallel to validate and 
complement lineages resolved by GESTALT. 

Although further work is required to optimize 
GESTALT toward enabling spatiotemporally com- 
plete maps of cell lineage, our proof-of-principle 
experiments show that using multiplex in vivo 
genome editing to record lineage information to a 
compact barcode at an organism-wide scale will 
be a powerful tool for developmental biology. This 
approach is not limited to normal development 
but can also be applied to animal models of de- 
velopmental disorders, as well as to investigate 
the origins and progression of cancer. Our study 
also supports the notion that, although its most 
widespread application has been to modify en- 
dogenous biological circuits, genome editing can 
also be used to stably record biological informa- 
tion (47), analogous to recombinase-based mem- 
ories but with considerably greater flexibility and 
scalability. For example, coupling editing acti- 
vity to external stimuli or physiological changes 
could record the history of exposure to intrinsic 
or extrinsic signals. In the long term, we envision 
that rich, systematically generated maps of or- 
ganismal development, wherein lineage, epige- 
netic, transcriptional, and positional information 
are concurrently captured at single-cell resolution, 
will advance our understanding of normal development,
inherited diseases, and cancer.

Molecular recordings by directed
CRISPR spacer acquisition 


Seth L. Shipman,* Jeff Nivala,* Jeffrey D. Macklis, George M. Church†


INTRODUCTION: Although recent advances 
in DNA synthesis and sequencing technolo- 
gies have made practical the writing and read- 
out of arbitrary data in the form of synthetic 
DNA, still lacking are the robust tools neces- 
sary to generate a dynamic record of such in- 
formation within the genomes of living cells. 
An in vivo system, built out of biological parts 
with large storage capacity, would enable the 
recording of defined biological events into stable 
genetic memory and facilitate the tracking of 
long molecular and cellular histories. 


RATIONALE: The CRISPR (clustered regularly
interspaced short palindromic repeats)-Cas
system is a prokaryotic type of immunological
memory. Foreign DNA sequences originating
from viral infections are stored within genome-based
arrays in the form of short sequences,
called spacers, that confer sequence-specific
resistance to the invading nucleic acids. These
arrays not only preserve the spacer sequences
but also record the order in which the sequences
are acquired, generating a temporal record of
acquisition events. We harnessed this system
to record arbitrary DNA sequences into a genomic
CRISPR array in the form of spacers
acquired from synthetic oligonucleotides electroporated
into a population of cells overexpressing
the CRISPR adaptation proteins
Cas1 and Cas2. This enabled the recording of
defined molecular events into a stable genomic
locus over time and the storage of arbitrary information
across a population of cells.

Two modes of encoding information into the CRISPR locus. (A) Mode 1: Spacer sequence. Oligonucleotides containing an AAG
PAM and 32 variable bases were electroporated into cells overexpressing Cas1-Cas2 and inserted into the
genomic CRISPR array. Delivery of oligos with distinct sequence over time generates a molecular record.
(B) Mode 2: Spacer orientation. Cas1-Cas2 mutants identified through directed evolution alter the orientation of acquisition. Varying
expression ratios of wild-type and mutant Cas1-Cas2 over time generates a record encoded in spacer orientation.


RESULTS: We show that the Cas1-Cas2 complex
can be used in vivo to integrate synthetic
DNA of a defined sequence into the Escherichia
coli genome. We used this feature to examine
the type I-E CRISPR-Cas spacer acquisition process
and optimized the synthetic spacer design to
achieve higher acquisition efficiency and specific
integration orientation
through the addition of an AAG protospacer
adjacent motif (PAM). We then generated stable
genomic recordings of multiple molecular
events by electroporating sets of oligonucleotides
over several days. These molecular records
were read out with high-throughput sequencing
and then decoded with a program that
identified and faithfully reconstructed the
temporal event order.

Read the full article at http://dx.doi.org/10.1126/science.aaf1175

Last, we used directed evolution to generate
many Cas1-Cas2 mutants with modified
PAM specificity (PAMNC). By modulating
expression of these mutant and wild-type 
Cas1-Cas2 complexes, we could dynamically 
control the orientation of spacer integration. 
This enabled us to record acquisition events 
in multiple modes. That is, information was 
encoded in both the temporal order of the 
spacers and the orientation in which they 
were integrated. 


CONCLUSION: Our results establish a record- 
ing system that uses the nucleotide content, 
temporal ordering, and orientation of de- 
fined DNA sequences within a CRISPR array 
in order to encode arbitrary information with- 
in the genomes of a population of cells. Be- 
cause information can be encoded in spacer 
nucleotide space (up to two bits per base) and 
in alternate modes, the system has the po- 
tential to record and permanently store higher 
capacities of information than any other syn- 
thetic biological system to date. This lays the 
foundation for an in vivo recording device 
that could be coupled with diverse molecu- 
lar phenomena and used for applications that 
require tracing of long molecular histories. We 
also demonstrate that delivery of synthetic 
DNA substrates to a CRISPR-Cas adaptation 
system in vivo is a practical method to probe 
and adapt the system. 
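
As a worked illustration of the "two bits per base" figure, the Python sketch below packs bytes into bases and back. The base-to-bit mapping (A=00, C=01, G=10, T=11) and the function names are assumptions made for this example only, and biological constraints on spacer sequence are ignored. With the 32 variable bases per oligonucleotide described above, this gives an upper bound of 64 bits (8 bytes) per spacer.

```python
BASES = "ACGT"  # assumed mapping for illustration: A=00, C=01, G=10, T=11
VARIABLE_BASES = 32  # variable bases per oligo, as stated in the figure legend


def encode(data: bytes) -> str:
    """Encode bytes as DNA, two bits per base (four bases per byte)."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data
                   for shift in (6, 4, 2, 0))


def decode(seq: str) -> bytes:
    """Invert encode(); len(seq) must be a multiple of four."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


bits_per_spacer = 2 * VARIABLE_BASES
print(bits_per_spacer)         # 64
print(encode(b"\x1b"))         # ACGT
payload = b"CRISPR!!"          # 8 bytes fill exactly 32 bases
print(len(encode(payload)))    # 32
```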


The list of author affiliations is available in the full article online.

Molecular recordings by directed
CRISPR spacer acquisition


Seth L. Shipman,* Jeff Nivala,* Jeffrey D. Macklis, George M. Church†


The ability to write a stable record of identified molecular events into a specific genomic 
locus would enable the examination of long cellular histories and have many applications, 
ranging from developmental biology to synthetic devices. We show that the type I-E 
CRISPR (clustered regularly interspaced short palindromic repeats)-Cas system of
Escherichia coli can mediate acquisition of defined pieces of synthetic DNA. We 
harnessed this feature to generate records of specific DNA sequences into a population of 
bacterial genomes. We then applied directed evolution so as to alter the recognition of a 
protospacer adjacent motif by the Cas1-Cas2 complex, which enabled recording in two 
modes simultaneously. We used this system to reveal aspects of spacer acquisition, 
fundamental to the CRISPR-Cas adaptation process. These results lay the foundations of a
multimodal intracellular recording device.


DNA has the potential to encode, preserve,
and propagate information (1). The precipitous
drop in DNA sequencing cost has
now made it practical to read out this information
with high throughput (2). However,
the ability to write arbitrary information
into DNA, in particular within the genomes of 
living cells, has been restrained by a lack of bio- 
logically compatible recording systems that can 
exploit anything close to the full encoding capa- 
city of nucleic acid space. 

A number of approaches aimed at recording 
information within cells have been explored (3). 
These systems can be broadly divided into those 
that alter transcription through feedback loops 
and toggles (4-14) and those that encode infor- 
mation permanently into the genome, most often 
using recombinases to store information via the 
orientation of DNA segments (15-19). Although 
the majority of these systems are effectively bi- 
nary, efforts have also been made toward analog 
recording systems (20) and digital counters (21).
Despite these efforts, the recording and genetic
storage of little more than a single byte of information
(18) has remained out of reach.

Immunological memory is essential to an or- 
ganism’s adaptive immune response and hence 
must be an efficient and robust form of recording 
molecular events in living cells. The CRISPR-Cas 
system is a recently understood form of adaptive 
immunity used by bacteria and archaea (22). This 
system records past infections by storing short
sequences of viral DNA within a genomic array.
These acquired sequences are referred to as pro- 
tospacers in their native viral context and as 
spacers once they are inserted into the CRISPR 
(clustered regularly interspaced short palindromic 
repeats) array. New spacers are integrated into 
the CRISPR array ahead of older spacers (23). 
Over time, a long record of spacer sequences can 
be stored in the genomic array, arranged in the 
order in which they were acquired. Thus, the CRISPR 
array functions as a high-capacity temporal mem- 
ory bank of invading nucleic acids. 
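
Because new spacers are added ahead of older ones, reading an array from the leader outward gives the spacers from newest to oldest. The Python sketch below shows one way such reads from many cells might be combined into a single acquisition order. It is a hypothetical illustration, not the decoding program used in this study; the function `infer_order` and the spacer labels are invented for the example, and the standard-library `graphlib` module requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def infer_order(arrays):
    """Infer acquisition order (oldest first) from sequenced arrays.

    Each array is a list of spacer labels read leader-proximal first,
    i.e. newest first. Raises graphlib.CycleError if arrays conflict.
    """
    sorter = TopologicalSorter()
    for array in arrays:
        chronological = list(reversed(array))  # oldest spacer first
        for spacer in chronological:
            sorter.add(spacer)  # register spacers seen alone
        # Each adjacent pair says: `older` was acquired before `newer`.
        for older, newer in zip(chronological, chronological[1:]):
            sorter.add(newer, older)
    return list(sorter.static_order())


# Hypothetical reads from three cells; each holds only part of the record.
reads = [["B", "A"], ["C", "B"], ["C"]]
print(infer_order(reads))  # ['A', 'B', 'C']
```

No single cell here holds all three spacers, yet the pairwise orderings from the population are enough to recover the full sequence of events.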

We harnessed the CRISPR-Cas system to record 
specific and arbitrary DNA sequences into a bac- 
terial genome. We could generate a record of de- 
fined sequences, recorded over many days and in 
multiple modalities. In exploring this system, we 
also elucidated fundamental aspects of native 
CRISPR-Cas spacer acquisition and leveraged 
this knowledge to enhance the recording system. 


A type I-E CRISPR-Cas system accepts 
synthetic spacers in vivo 


Overexpression of the Escherichia coli type I-E 
CRISPR-Cas proteins Cas1 and Cas2 is sufficient 
to drive acquisition of new spacers in a strain con- 
taining two genomic CRISPR arrays but lacking 
endogenous Cas proteins (BL21-AI) (23). We replicated
this result (Fig. 1A) and similarly found
that new spacers were consistently integrated into 
the first position of array I directly adjacent to the 
leader with a consistent size of 33 bases (fig. S1, A 
and B). These spacers were drawn in roughly 
equal number from the cell’s own genome and 
from the plasmid used to overexpress Cas1 and
Cas2 (Fig. 1B). Considering the overall DNA con- 
tent of the cell, this ratio of genome-to-plasmid- 
derived spacers represents a substantial bias toward 
the plasmid as a protospacer source (24). Despite 
this bias, new spacers were drawn from a diverse 
range of sites around the genome and plasmid
(Fig. 1C) and, besides the overrepresentation
of a 5’ AAG protospacer adjacent motif (PAM), 
there was no way to predict a priori the full se- 
quence of a new spacer without sequencing the 
expanded array. 

To extend the function of the CRISPR acqui- 
sition system into a synthetic device for record- 
ing molecular events, it is necessary to direct the 
system to capture spacers of specific, defined 
sequence. In vitro, Cas1 and Cas2 can mediate 
integration of synthetic 33-base pair (bp) DNA 
oligos into plasmid-based arrays (25). We reasoned 
that similarly supplying an exogenous source of 
protospacers to the system within a cell might 
direct sequence-specific spacer acquisition in vivo. 
We therefore passaged an overnight culture of 
E. coli BL21-AI containing arabinose- and isopropyl 
β-D-1-thiogalactopyranoside (IPTG)-inducible Cas1
and Cas2 genes with or without arabinose and 
IPTG for 2 hours. We then electroporated the 
cells with a complementary pair of 33-base oligos 
(protospacer ps33), which matched the sequence 
of the most abundant M13-derived spacer found 
after phage infection of a native type I-E system 
(26). After incubating the cells for another 2 hours 
after transformation, we checked the genomic 
array for expansion and specific integration of the 
synthetic protospacer into the array by means of 
polymerase chain reaction (PCR) (Fig. 1D). By using 
the reverse sequence of the supplied oligo as the 
reverse primer, we also observed amplification of 
specifically sized PCR products that confirmed 
acquisition of the oligo-supplied sequence when 
Cas1 and Cas2 were induced or (more weakly) 
uninduced, but never for the case in which the 
oligos were not supplied. We confirmed that the 
specific ps33 nucleotide sequence was present 
within a fraction of the expanded arrays by means 
of Sanger sequencing. These results demonstrate 
that the CRISPR-Cas system acquired a sequence- 
specific spacer. 

To better understand both the properties of this 
synthetic system as well as the fundamental 
properties of Cas1-Cas2-mediated spacer acqui- 
sition, we altered the oligos that we provided via 
electroporation. The system required both com- 
plementary strands for acquisition, and the double- 
stranded protospacer could insert in either direction 
(Fig. 1E). We modified the 5’ ends of the oligos with
phosphorothioate bonds to help resist degra- 
dation by cellular nucleases but found no differ- 
ences in acquisition efficiency (Fig. 1E). We tested 
whether RNA could serve as a protospacer by 
supplying either one or both of the oligo strands as 
RNA but detected no sequence-specific integration
of RNA oligos (fig. S1D).

To investigate these results more quantitatively, 
we performed a PCR across the array (as in Fig. 
1D) and subjected the resulting amplicon to high- 
throughput sequencing on an Illumina MiSeq 
platform. We quantified the percentage of all 
arrays that were expanded at the completion of 
an experiment, as well as the spacer source. Coupled 
with quantitative PCR, we generated a time course 
of spacer acquisition (Fig. 1F). Sequence-specific
acquisitions occurred as early as 20 min after electroporation,
reaching ~4% of all arrays by 2 hours.
The oligo concentration required to achieve spacer
acquisition was determined by testing a twofold 
dilution series (Fig. 1G and fig. S1E). Whether
oligos were delivered or acquired as spacers had 
no effect on the genome- or plasmid-derived spacers. 
Thus, protospacer availability in the cell may be a 
limiting factor in spacer acquisition. On the 
other hand, the inclusion of an additional CRISPR
array on the expression plasmid had little to no 
effect on the acquisition frequency of new spacers 
into the endogenous genomic array (Fig. 1G). Like 
genome- and plasmid-derived spacers, the syn- 
thetic spacers were inserted into the first (or 
occasionally first and second) positions of the 
array, and the great majority were of 33 bases 
(Fig. 1, H and I). Loss of previously acquired 
spacers has been reported both in the presence 
(27, 28) and absence (29, 30) of selective pressure. 
Although our analysis was restricted to the leader-proximal
spacers, we did find rare instances in
which the previous first spacer was deleted (0.096%
of arrays sequenced ± 0.012 SEM).
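
As a rough illustration of the quantification described above, the following Python sketch splits a sequencing read on the CRISPR repeat to recover its spacers, then reports the percentage of reads that carry at least one spacer absent from the parental array. The repeat, the reads, and the function names are hypothetical placeholders, not the sequences or the pipeline used in this study, and exact string matching ignores sequencing errors.

```python
def extract_spacers(read: str, repeat: str) -> list[str]:
    """Return the spacers found between consecutive repeats, leader-proximal first."""
    parts = read.split(repeat)
    # parts[0] precedes the first repeat (leader side); parts[-1] follows the last repeat.
    return parts[1:-1]


def percent_expanded(reads, repeat, parental_spacers):
    """Percentage of reads holding at least one spacer not present in the parental array."""
    if not reads:
        return 0.0
    expanded = sum(
        any(s not in parental_spacers for s in extract_spacers(r, repeat))
        for r in reads
    )
    return 100.0 * expanded / len(reads)


# Toy data (hypothetical, not real CRISPR sequences).
REPEAT = "GGTTCC"
parental = {"CTCT"}
reads = [
    "ATAT" + REPEAT + "AAAC" + REPEAT + "CTCT" + REPEAT + "TT",  # one new spacer
    "ATAT" + REPEAT + "CTCT" + REPEAT + "TT",                      # unexpanded
]
print(extract_spacers(reads[0], REPEAT))
print(percent_expanded(reads, REPEAT, parental))
```

Because new spacers enter at the leader end, the first element returned for an expanded read is the most recently acquired spacer.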


PAMs modify the efficiency and 
directionality of spacer acquisition 


Data from sequencing millions of expanded 
arrays showed that genome- and plasmid-derived 
protospacers were drawn in equivalent numbers 
from the forward and reverse strands overall, with 
the only apparent bias being toward the genomic 
origin of replication (Fig. 2A). Similarly, oligo- 
derived protospacers were found in equal pro- 
portions in the forward and reverse orientation 
in the array (Fig. 2B). When we further examined 
the context of the genomic- and plasmid-derived 
protospacers, we found strong evidence for a 
PAM on the 5’ end of the protospacer consisting 
of two adenines at positions -2 and -1 from the 
spacer and a strong bias for a guanine as the first
spacer base (Fig. 2C). This is largely consistent
with previous characterizations of the E. coli type
I-E system (31, 32). An interior sequence motif at 
the 3’ end of the spacer termed the acquisition- 
affecting motif (AAM) has also been reported for 
this system (37). We found spacer sequences that 
are consistent with the presence of this interior 
motif, but the frequency of its occurrence is 
minor compared with the 5’ PAM. 
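
The PAM analysis above amounts to locating each acquired spacer in its source sequence and reading the bases at positions -2 and -1 plus the first spacer base. A minimal Python sketch of that lookup follows. It assumes a linear reference, exact matches, and only the first match on each strand; the sequences and function names are invented for illustration.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def pam_context(reference: str, spacer: str):
    """Return (strand, 3-base context) for the first match of the spacer, or None.

    The context is the two bases immediately 5' of the spacer plus the first
    spacer base, so a canonical protospacer reads 'AAG'.
    """
    for strand, ref in (("+", reference), ("-", reverse_complement(reference))):
        start = ref.find(spacer)
        if start >= 2:  # need two upstream bases to read the PAM
            return strand, ref[start - 2:start + 1]
    return None


def pam_counts(reference: str, spacers) -> Counter:
    """Tally the 3-base context over all spacers that can be mapped."""
    counts = Counter()
    for spacer in spacers:
        hit = pam_context(reference, spacer)
        if hit is not None:
            counts[hit[1]] += 1
    return counts


# Toy reference and spacers (hypothetical).
reference = "CCAAGTTCAGGACTTACGGA"
spacers = ["GTTCAGGAC", "GTCCTGAAC", "AAAAAAA"]
print(pam_counts(reference, spacers))
```

In this toy case the first spacer maps to the forward strand and the second to the reverse strand, and both carry an AAG context; the third spacer does not map and is skipped.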

Although there is no bias in forward- or reverse- 
strand-derived protospacers from the genome or 
plasmid on the whole, a sharper picture emerged at 
the level of individual nucleotides. For example, 
examining one small stretch of the plasmid (~550 
bases), asymmetric peaks of spacer coverage—that 
is, the cumulative count of each time a given nu- 
cleotide was observed within an acquired spacer— 
emerged (Fig. 2D). Plotting the forward and 
reverse PAMs along the same stretch of plasmid 
revealed that in addition to biasing toward spe- 
cific sequences for acquisition, the PAM also 
Fig. 1. Acquisition of synthetic spacers. (A) Schematic of the minimal
elements of the type I-E CRISPR acquisition system used, including Cas1, Cas2,
and array with leader (L), repeat (R), and spacer (S) along with PCR detection
of an expanded array after the overnight induction of Cas1-Cas2. (B) Origin of
new spacers (plasmid or genome), mean ± SEM. (C) Genome- and plasmid-derived
spacers after overnight induction are mapped back to the approximate
location of their protospacer (marked in red). (D) Array expansion (top) and
specific acquisition of synthetic oligo protospacer (bottom) after electroporation.
Top schematic shows the experimental outline. Schematics under each
gel show specific PCR strategy. (E) Sequence-specific acquisition in either the
forward (top) or reverse (bottom) orientation after electroporation with various
single- and double-stranded oligos. 5'PT indicates phosphorothioate modifications
to the oligos at the 5’ ends. (F) Time course of expansion after electroporation,
mean ± SEM. (G) Percent of arrays expanded by spacer source as a
function of electroporated oligo concentration, mean ± SEM. (H) Position of
new spacers relative to the leader, mean ± SEM. (I) Size of new spacers in base
pairs, mean ± SEM. All gels are representative of ≥3 biological replicates; *P <
0.05. Additional statistical details are provided in table S1.
Fig. 2. PAMs modify the efficiency and orientation of spacer acquisition.
(A) Genome-derived (count/10 kb) and plasmid-derived (coverage/base)
spacers mapped to their protospacer location on the forward (purple) or
reverse (green) strands. (B) Direction of oligo-derived spacers in the forward
(purple) or reverse (green) orientation, mean ± SEM. (C) Representative sequence
pLOGO (46) generated based on 896 distinct genome- and plasmid-derived
protospacers. Five bases of the protospacer are included at each end of
the spacer. (D) Plot of the summed spacer coverage mapped to the plasmid
among three replicates at each nucleotide for a 553-nucleotide stretch. Carets
demarcate canonical PAMs on the forward (purple) or reverse (green) strand.
Scale bar, 33 bases. Individual replicates are shown below. (E) Percent of arrays
expanded by spacer source for different oligo protospacers, mean ± SEM. (F) Ratio
of oligo-derived spacers acquired in the forward versus reverse orientation for
different oligo protospacers, mean ± SEM. (G to J) Normalized representation
of oligo-derived spacers by base acquired in the forward and reverse direction
for each oligo. Bars in (I) and (J) are 33 bases long to show dominant and
minority spacers drawn from the oligo protospacers. For all panels, *P < 0.05.
Additional statistical details are provided in table S1.
specified the orientation of integration into the
array. Although nearly every protospacer that 
contained a PAM was acquired as a spacer, not 
all were acquired at the same frequency (Fig. 2D). 
The presence of Chi sites—an eight-base motif 
in which double-strand break repair is more likely 
to occur—within a genome or plasmid biases the 
frequency of protospacer acquisitions (24). How- 
ever, we wondered whether the sequence of the 
protospacer itself might also bias acquisition fre- 
quency. We ranked every PAM (AAG)-containing 
potential protospacer in the plasmid according to 
the frequency at which it was acquired into the 
genomic array (fig. S2A). We searched for charac- 
teristics among protospacers, including GC per- 
centage and free energy, that might explain the 
difference in acquisition frequency, but failed to 
identify a correlation (fig. S2, B and C). For a direct 
test, we selected and synthesized three proto- 
spacer sequences (including their 15-bp flanking 
regions): one each from the high (psH), middle 
(psM), and low (psL) end of the frequency spec- 
trum (fig. S2A). We then electroporated each of 
these oligo protospacers into cells expressing 
Cas1-Cas2 from an alternate plasmid that did 
not include these particular sequences. psL was 
acquired much less frequently than psH or psM 
(fig. S2F). To determine whether this was caused 
by the sequence of the spacer itself or a flanking 
region, we swapped the 15-bp flanking regions of 
psH with those of psL and vice versa (psH/L and 
psL/H, respectively). Again, the psL/H spacer was 
acquired at a lower frequency than was psH/L, 
independent of the flanking regions. These re- 
sults indicate that the sequence of the protospacer 
itself influences the efficiency of acquisition. We 
do not know, however, the mechanism of this 
effect, whether by a direct effect on the acquisi- 
tion process itself or by indirect effects such as 
sequence-dependent interactions with endogenous 
nucleotides, competing proteins, or degradation. 
Given that spacers are selected from the ge- 
nome and plasmid according to an adjacent se- 
quence, we wondered whether the inclusion of a 
PAM in our synthetic protospacer ps33 would 
alter acquisition frequency. We designed three 
additional oligo protospacers: psAA33, in which 
two adenines were included at the 5’ end of ps33 
to create the entire canonical AAG PAM; ps10AA33, 
which includes an additional 10 5’ nucleotides; 
and ps10TC33, in which the AA of the PAM was 
mutated to TC to create a noncanonical PAM 
(PAM^NC). Using these oligos, we found that the
inclusion of a PAM greatly increased the effi- 
ciency of sequence-specific acquisition (Fig. 2E). 
Whether preceded by 10 extra nucleotides or not, 
oligos with the AAG PAM (psAA33 and ps10AA33) 
were acquired at greater than five times the frequency 
of those that did not include a PAM (ps33). Con- 
versely, including the TCG PAM did not change 
acquisition frequency relative to ps33 (Fig. 2E). 
In line with what has been previously observed 
for the PAM motif in CRISPR adaptation—that it 
is consistently localized to the leading rather than 
trailing end of the integrated spacer (24, 31, 33-36)— 
the inclusion of a PAM also altered the orientation 
frequency of oligo-derived spacer acquisition. 
Whereas ps33 and ps10TC33 were acquired equally
in both orientations, psAA33 and ps10AA33
were acquired almost exclusively in the forward 
orientation (Fig. 2, F to J, and fig. S3A). Consist- 
ent with the type I-E preference for an AAG 
PAM, psAA33 and ps10AA33 were consistently
inserted with nucleotide G1 as the first base of
the spacer (Fig. 2, H and I). In contrast, ps10TC33
lacked a single dominant spacer product and
was inserted at several different PAM^NC sequences (Fig.
2J). We verified that both Cas1 and Cas2 were 
necessary for synthetic spacer integration, where- 
as Cas2 nuclease activity was not required (fig. S3, 
B and C) (25). Therefore, the inclusion of a PAM in 
synthetic protospacers dictates both the efficiency 
and orientation of the spacer that is acquired by 
the Cas1-Cas2 complex.
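
To make the orientation readout concrete, the Python sketch below labels an oligo-derived spacer as forward when it matches the top strand of the electroporated oligo and as reverse when it matches the reverse complement, then tallies both. The oligo, the spacers, and the convention that a top-strand match counts as "forward" are assumptions made for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def classify_orientation(spacer: str, oligo_top: str):
    """Return 'forward', 'reverse', or None if unmatched or ambiguous."""
    in_top = spacer in oligo_top
    in_bottom = spacer in reverse_complement(oligo_top)
    if in_top and not in_bottom:
        return "forward"
    if in_bottom and not in_top:
        return "reverse"
    return None  # not oligo-derived, or matches both strands


def orientation_counts(spacers, oligo_top: str) -> dict:
    counts = {"forward": 0, "reverse": 0}
    for spacer in spacers:
        label = classify_orientation(spacer, oligo_top)
        if label is not None:
            counts[label] += 1
    return counts


# Toy oligo and spacers (hypothetical).
oligo_top = "AAGCCGTTACGATT"
observed = ["GCCGTTACG", "GCCGTTACG", "TCGTAACGG", "GGGGGGGGG"]
counts = orientation_counts(observed, oligo_top)
print(counts)
print(counts["forward"] / counts["reverse"])  # forward-to-reverse ratio
```

A real analysis would also need to guard against a zero reverse count before taking the ratio.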


A molecular recording over time 


We tested whether we could harness the acqui- 
sition of specific spacer sequences to record a 
series of synthetic spacers into a population of 
cells over time. As an initial test, we recorded 
three unique elements (1 x 3) into a single cul- 
ture of E. coli by sequentially electroporating a 
series of three different oligo protospacer se- 
quences into the culture, over a period of 3 days 
(one protospacer each day) (fig. S4A). After se- 
quencing a population of the arrays on day 3, we 
could reconstruct the order in which the spacers 
were delivered (fig. S4, B and C, and materials 
and methods). To further probe the limits of this 
system, we recorded 15 distinct elements (3 x 5): 
three sets of five protospacers, electroporated three 
at a time over 5 days (Fig. 3A). The analysis of 
both the 1 x 3 and 3 x 5 recordings is conceptually
similar, so we discuss the latter in
detail (fig. S4B and Fig. 3B, respectively).

For the 3 x 5 recording, all oligo protospacers 
consisted of 35 nucleotides, beginning with a 5’ 
AAG PAM followed by a five-base barcode (specific 
to each of the three sets) and 27 more bases (specific 
to each of the 15 protospacers). At the end of the 
3 x 5 recording, nearly a quarter of all arrays in 
the cell population contained at least one oligo- 
derived spacer, with spacers from each round of 
electroporation represented in roughly equivalent 
proportions (Fig. 3, C and D). Individual variations 
among the spacer acquisition frequency were 
more heavily driven by spacer nucleotide sequence 
than by the round in which they were acquired 
(Fig. 3E), whereas loss of recorded spacers after 
acquisition was rare (0.076% ± 0.182 SEM).
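
The protospacer layout described above (a 5’ AAG PAM, a five-base set barcode, and 27 element-specific bases) can be sketched in Python as follows. The barcodes and payload are invented placeholders, and the sketch assumes that the acquired 33-base spacer is the oligo without its two leading adenines, so that it begins with the G of the PAM.

```python
PAM = "AAG"
BARCODE_LEN = 5
PAYLOAD_LEN = 27

# Hypothetical set barcodes; the barcodes used in the study are not listed here.
BARCODES = {"ACGTA": "set 1", "CTAGC": "set 2", "TGCAT": "set 3"}


def make_protospacer(barcode: str, payload: str) -> str:
    """Assemble a 35-nucleotide oligo: PAM + set barcode + payload."""
    if len(barcode) != BARCODE_LEN or len(payload) != PAYLOAD_LEN:
        raise ValueError("barcode must be 5 bases and payload 27 bases")
    return PAM + barcode + payload


def parse_spacer(spacer: str):
    """Return (set name, payload) for a 33-base oligo-derived spacer, else None."""
    expected_len = 1 + BARCODE_LEN + PAYLOAD_LEN  # leading G + barcode + payload
    if len(spacer) != expected_len or spacer[0] != "G":
        return None
    set_name = BARCODES.get(spacer[1:1 + BARCODE_LEN])
    if set_name is None:
        return None
    return set_name, spacer[1 + BARCODE_LEN:]


oligo = make_protospacer("CTAGC", "T" * 27)
spacer = oligo[2:]  # assumed acquired form: the AA of the PAM is not integrated
print(len(oligo), len(spacer))
print(parse_spacer(spacer))
```

Spacers that fail the length, leading-G, or barcode check return None, which is one simple way to set aside genome- and plasmid-derived spacers.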

Because of the low probability of acquiring 
spacers from every round in any single array 
(Fig. 3D), successful readout of the recording re- 
quired analysis of a population of arrays. There- 
fore, we sequenced the first three spacers of each 
array (moving in from the leader) and considered 
only the order of pairs of newly acquired spacers 
(Fig. 3B). For any given synthetic spacer pair 
within the same set, the order should follow a 
predictable rule: Among all arrays that contain 
any two new spacers, a spacer electroporated in 
an earlier round will always be found further from 
the leader than a spacer introduced at a later 
round. We also gained information by considering
the arrangement of oligo-derived spacers in relation
to newly acquired genome- and plasmid-
derived spacers. Because the endogenous spacers 
will accumulate over time, synthetic spacers from 
an earlier round will be paired more often with a 
new genome/plasmid spacer in one direction 
(toward the leader) than in the other (relative to 
the synthetic spacer), and vice versa for oligo- 
derived spacers from a later round. With five pos- 
sible spacers (in each set), we considered all 
possible pairwise comparisons and generated 15 
ordering rules from which we can reconstruct 
the order of the entire set (Fig. 3B). We took the 
sequences of arrays after the completion of the 
3 x 5 recording and passed them through an 
algorithm that, with the only sequence-based 
input being the sequence of the CRISPR repeat, 
would predict all oligo-derived spacer sequences, 
assign them to a set according to the barcodes, 
and then test all possible permutations of the 
sequence against the 15 ordering rules. For each 
set, only one permutation satisfied all 15 ordering 
rules, and in every case, that permutation matched
the actual order of electroporated oligos (Fig. 3F). 
Although we analyzed ~2 million reads for each 
replicate, we found that order could be correctly 
reconstructed in most cases with 20,000 reads or 
fewer. Thus, we could reliably record and read out 
the 15-element recording. 
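
The permutation test described in this section can be illustrated with a simplified Python sketch. It uses only the ten pairwise rules between synthetic spacers within a set (the five additional rules involving genome- and plasmid-derived spacers are omitted), and it treats a rule as passed when a strict majority of arrays place the later spacer closer to the leader. Both simplifications, along with the toy arrays and function names, are assumptions and not the authors' algorithm.

```python
from collections import Counter
from itertools import combinations, permutations


def pair_counts(arrays) -> Counter:
    """Count ordered pairs (leader-proximal, leader-distal) over all arrays."""
    counts = Counter()
    for array in arrays:  # each array lists new spacers, leader-proximal first
        for near, far in combinations(array, 2):
            counts[(near, far)] += 1
    return counts


def consistent_orders(arrays, spacers):
    """Return every chronological order that satisfies all pairwise rules."""
    counts = pair_counts(arrays)
    passing = []
    for order in permutations(spacers):  # candidate order, earliest round first
        if all(
            counts[(later, earlier)] > counts[(earlier, later)]
            for earlier, later in combinations(order, 2)
        ):
            passing.append(order)
    return passing


# Toy population: spacers A..E were delivered in that order (hypothetical data).
true_order = "ABCDE"
arrays = [[later, earlier] for earlier, later in combinations(true_order, 2)] * 3
arrays.append(["E", "C", "A"])  # a triple expansion
arrays.append(["A", "B"])       # one array that conflicts with the true order
print(consistent_orders(arrays, true_order))
```

With five spacers there are 120 candidate orders. In this toy population only the true order passes every rule, even with one conflicting array. A pair that is never observed makes every candidate fail, which reflects the need for enough reads to cover all pairs.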


Cas1-Cas2 PAM recognition 
can be modified 


The ability to control not only the sequence of 
new spacers but also the orientation of new spacer 
integration would enable recording of information 
in multiple modalities simultaneously. Because the 
addition of a 5’ AAG PAM on our synthetic spacers 
controlled the orientation of new acquisitions 
(Fig. 2F), we sought to modify integration orienta- 
tion by altering PAM recognition of Cas1-Cas2. 
To do this, we performed the directed evolution 
approach shown in Fig. 4A. First, we generated 
a large library of random Cas1-Cas2 mutants by 
means of error-prone PCR (fig. S5, A and B) and
inserted this library into a plasmid upstream 
of a minimal CRISPR array. After cloning the 
plasmid library into BL21-AI, we induced and 
transformed mutants with a protospacer bearing 
the canonical 5' AAG PAM on the forward strand 
and a noncanonical 5' TCG PAM^NC on the reverse
strand. After outgrowth, we selected mutants
using a forward primer ahead of the Cas1-Cas2
mutant genes and a reverse primer matching the
PAM^NC spacer sequence in order to yield specific
amplification of only those mutants that had acquired
the spacer in the (reverse) PAM^NC orientation.
A subset of these selected mutants were
then tested for PAM specificity, and a separate 
subset were subjected to another round of selec- 
tion for refinement before testing. For testing, 
individually selected mutant clones were in- 
duced overnight, and their expanded arrays were 
analyzed by means of sequencing. Specifically, we 
analyzed the PAMs of all genome- and plasmid-
derived spacers to determine what, if any, PAM 
specificity remained. Wild-type Cas1-Cas2 acquires 
spacers from AAG PAM protospacers at nearly the 
Fig. 3. A molecular recording over time. (A) Experimental outline of the 3 x
5 recording. Over 5 days, three sets of five oligo protospacers (15 elements)
were electroporated (one protospacer from each of the three sets each day)
into cells expressing Cas1-Cas2. Time points at which cells were sampled for
sequencing are numbered 1 to 6. (B) Schematic illustrating all possible pairwise
ordering of new spacers. G/P denotes a spacer derived from the genome or
plasmid. Ordering rules are shown below. In the case of y = z, asterisk indicates
a tolerance within 20% of the mean of both values. (C) At each of the six
sample points [marked in (A)], percent of all arrays expanded with synthetic
spacers from each of the indicated rounds, mean ± SEM. (D) Single, double,
and triple expansions for each round, mean ± SEM. (E) Percent of all expansions
at sample point six, broken down by electroporation round and set. Open circles
are individual replicates; filled bars are mean ± SEM. (F) Results of ordering rule
analysis for one replicate across each set. For all 120 permutations, results of the
tested rule are shown (green indicates pass, red indicates fail). For all sets, only
one permutation passed all rules and in every case that permutation matched
the actual order in which the oligos were electroporated (as indicated by check
mark). Additional statistical details are provided in table S1.
same frequency as from all other (non-AAG) PAM
protospacers combined (Fig. 4B). In contrast, the 
majority of mutants we selected acquired non- 
AAG protospacers at a greater frequency than 
that of AAG protospacers (Fig. 4B). There was no 
gain in non-AAG acquisition frequency from the 
extra step of refinement (fig. S5C), so mutants 
from both subsets are shown together (Fig. 4B 
and fig. S5D). 

To visualize shifts in PAM specificity, we plot- 
ted a heat map showing the normalized frequency 
of observed PAMs among all potential PAMs for 
wild-type Cas1-Cas2 and several selected mutants
(Fig. 4C). Wild-type Cas1-Cas2 had strong selectivity
for the canonical AAG PAM. A minority of
mutants also retained (m-24) or even increased 
(m-27) this preference. However, many more 
mutants showed reduced or, in the case of the 
three mutants shown (m-74, m-80, and m-89), 
nearly no specificity for the canonical PAM. From 
the sequence of these selected mutants, we chose 
a subset of single-point mutations for follow-up 
analysis on the basis of repeated observations in 
the data set or location in the crystal structure of 
the Cas1-Cas2 complex (Fig. 4E and table S3) (37-39).
Most of the single-point mutants tested in isolation
also reduced the PAM specificity compared
with that of wild type (Fig. 4D and fig. S5D). 
These results demonstrate that PAM recognition 
by the Cas1-Cas2 complex can be modified by many 
different mutations without drastically reducing 
spacer acquisition efficiency. 


Recording in a second modality 


As a proof of concept, we selected a PAM^NC
Cas1-Cas2 mutant (m-89) (Fig. 4C and fig. S5D) to 
add an extra modality to the 1 x 3 recording (fig. S4). 
We subjected bacteria to three sequential rounds 
of electroporation, with each oligo protospacer 
Fig. 4. Directed evolution of PAM recognition. (A) Schematic of the directed
evolution. (B) Testing of selected mutants, plotting 5’ AAG versus non-AAG
PAM protospacers normalized to count per 100,000 sequences. Scatter plot
shows 65 induced mutants (open black circles), three induced wild-type
replicates (open green circles), an uninduced wild type (open red circle), the
average of the induced mutants (filled black circle), and the average of the
induced wild types (filled green circle) ± SEM. Scatter plot to the right is an
inset of the larger plot. (C) Heatmap of protospacer PAM frequency over the
entire sequence space for wild-type Cas1-Cas2 (wt), mutants that increase or
maintain AAG PAM specificity (m-27 and m-24), and mutants that lose AAG
PAM specificity (m-74, m-80, and m-89). Numbers at top right correlate to
numbers in (B). (D) A subset of selected mutants reassayed in triplicate as well
as a subset of single-point mutants chosen from the original selection. All points
are the average of three replicates ± SEM. (E) Crystal structure of Cas1-Cas2
complex bound to a protospacer (38). Inset highlights, in purple, residues in the
Cas1 active site that (when mutated) decrease PAM specificity. The protospacer
PAM complementary sequence (T30 T29 C28, numbering as in PDB
ID 5DQZ) is also noted. Additional statistical details are provided in table S1.


sciencemag.org SCIENCE 


Downloaded from http://science.sciencemag.org/ on July 28, 2016 


RESEARCH | RESEARCH ARTICLE 


containing a 5’ AAG PAM on the forward strand 
and a 5’ TCG PAM^NC on the reverse (Fig. 5A). We
controlled expression of wild-type Cas1-Cas2 and
m-89 using different inducible promoters (pT7lac
and pLtetO, respectively) on the same plasmid
(Fig. 5B). We split the bacteria between two con- 
ditions, each alternating between T7lac and tet 
induction from round to round. We found that 
cells of both conditions acquired spacers from 
each round at similar frequencies, indicating that 
transcription and integration activity of the wild- 
type and m-89 Cas1-Cas2 were both adequate 
(Fig. 5C). At the completion of the recording, we 
compared the orientation of each spacer between
the two conditions. The ratio of forward- to
reverse-oriented spacers shifted toward PAM^NC
(reverse) during tet induction (Fig. 5, D and F).
After normalization for the total spacer orientation 
ratio for each spacer, we could clearly discriminate 
which cultures had been exposed to each inducer 
at each time point on the basis of only the di- 
rection of integration (Fig. 5G). Thus, this system 
can simultaneously record in two modalities. 


Discussion 


We developed a CRISPR-Cas-based system to re- 
cord molecular events into a genome in the form 
of essentially arbitrary synthetic DNA sequences.
Although the information is only partially encoded
within any given cell, the complete record remains 
distributed across a population of cells. To read 
out the recordings, we used high-throughput 
sequencing and only considered the pairwise 
order of any two new spacer sequences within 
single CRISPR arrays. From these many binary 
comparisons, a complete record of events could 
then be assembled, faithfully decoding the dis- 
tributed memory fully preserved within the cell 
population. An important consideration of this 
system is that despite the necessary destruction 
of cells for readout at the end of the recording, 
the encoding process is not destructive. Thus, as 


Fig. 5. Recording in an additional mode. (A) Outline of the recording process.
Three different synthetic protospacers (each containing a 5’ AAG PAM on
the forward strand and a 5’ TCG PAM on the reverse) were electroporated over
3 days (one protospacer each day) into two bacterial cultures under different
induction conditions (shown below timeline). Sampling time points are
numbered 1 to 3. (B) Schematic of the plasmid construct used, showing
wild-type and PAM^NC mutant (m-89) Cas1-Cas2 driven by independently
inducible promoters (T7lac and pLtetO, respectively). The heatmap shows 5’
PAM specificity for wild type (boxed in yellow) and mutant m-89 (boxed in red).
(C) At each of the three sample points [marked in (A)], percent of expanded
arrays with spacers from each of the indicated rounds for the two conditions,
mean ± SEM. (D to F) Ratio of synthetic spacers acquired in the forward
versus reverse orientation for each round under each condition, mean ± SEM.
(G) Ratio of forward to reverse integrations normalized to the sum of both
possible orientations for each of the two conditions, mean ± SEM. For all
panels, *P < 0.05. Additional statistical details are provided in table S1.
opposed to sequential sampling of a population
to generate a record of events, the current ap- 
proach does not require that cells be destroyed 
while the experiment is ongoing. Moreover, be- 
cause the recording is distributed across a pop- 
ulation, only a fraction of the population needs 
to be sampled to retrieve the recording. 

We uncovered details of the native CRISPR-Cas 
adaptation system. Integration of synthetic oligo 
sequences in vivo by the Cas1-Cas2 protein com- 
plex enabled us to directly assess detailed aspects 
of protospacer acquisition. Because the frequency 
of spacers acquired from the genome and plasmid 
is largely unaltered in the presence of oligo-derived 
acquisition (Figs. 1G and 2E), we conclude that the 
availability of adequate protospacers is likely one 
limiting aspect of the adaptation system. The pres- 
ence of a 5' AAG PAM modulated both the fre- 
quency and orientation of spacer acquisition, and 
the interior sequence of the protospacer influenced 
acquisition efficiency. 

Directed evolution allowed us to experimen- 
tally modify PAM recognition of the Cas1-Cas2 
complex, which enabled us to generate a record 
in multiple modalities simultaneously. This di- 
rected evolution method required no structural 
information and should be generally applicable 
to evolving other activities of CRISPR-Cas proteins 
by coupling them to the spacer acquisition process 
(for example, modifying target site specificity). 

There are challenges to directly comparing be- 
tween different cellular recording approaches. For 
instance, some are rewritable (4-7, 9-14, 17, 20, 21), 
whereas others, similar to our system, create per- 
manent records (15, 17-21). To date, the highest 
permanent storage capacity of a synthetic in vivo 
recording device was achieved by using 11 orthogonal
recombinases, capable of 2^11 (2,048) distinct
states, capturing 1.375 bytes of information within 
a single cell (18). In our 3 x 5 recording, we en- 
coded 15 individual elements within a popula- 
tion of cells. However, because this system can 
record arbitrary defined sequences, the number 
of possible states is expanded dramatically. With 
an invariable G at the beginning of the spacer 
and a five-base set identifier, 27 bases remain that 
could encode information, yielding 4^27 possible
distinct sequences per spacer. It was possible to
encode the order within each set to at least five
elements, resulting in a specific state capacity
for each set based on the permutation P(4^27, 5) =
1.9 × 10^81, or 5.7 × 10^81 combining the three sets
and assuming set independence. If we include
interdependence between each set, total distinct
states would rise to (4^27)^15 or ~7 × 10^243. As a point
of comparison, the number of atoms in the observable
universe is estimated at 1 × 10^80.
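
The state counts in this paragraph can be checked with exact integer arithmetic. The Python sketch below is only a verification aid; the variable names are ours, and it takes "combining the three sets and assuming set independence" to mean the sum over three sets, which is the reading that matches the quoted figure.

```python
from math import log10, perm

# Recombinase-based device: 11 orthogonal recombinases, one bit each.
recombinase_states = 2 ** 11
recombinase_bytes = 11 / 8

# Spacer-based recording: 27 free bases per spacer, 5 ordered spacers per set.
payload_states = 4 ** 27
per_set = perm(payload_states, 5)       # ordered choices of 5 distinct sequences
independent = 3 * per_set               # three sets, states summed
interdependent = payload_states ** 15   # all 15 spacers treated jointly

print(recombinase_states, recombinase_bytes)
print(f"{per_set:.1e}")
print(f"{independent:.1e}")
print(f"{interdependent:.1e}")
```

Python integers have arbitrary precision, so the counts are exact until they are formatted for display.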

Moving from theoretical to practical consid- 
erations, the information capacity of a given re- 
cording in our system depends on the degree to 
which the sequence of the protospacer is con- 
strained. If there are no sequence constraints on 
the protospacer, and thus any arbitrary sequence 
is available, then the 15 recorded spacers (in the 
3 x 5 recording paradigm) each contain 27 bases 
of recording potential at four bases per byte, 
yielding 101.25 bytes per recording. Throughout 
our experiments, we were able to vary the nucleotide
identity at every one of these 27 positions
in our oligo protospacers. However, we
have not explicitly tested, nor is it practical to 
test, all possible protospacers for viability. More- 
over, we have shown that the sequence of the 
protospacer can influence acquisition frequency, 
so it is reasonable to assume that not all possible 
sequences will be suitable protospacers. 

We can set an absolute lower limit on the in- 
formation capacity of the 3 x 5 recording presented 
here by assuming that the particular sequences 
that we used in the recording are the only possible 
sequences that could be used. In that case, we can 
encode information only in the order of the se- 
quences recorded in three sets of five possible 
spacers, disallowing repetition. In this case, the 
bits per set is given by log2[P(5,5)] = ~6.9 bits or
~2.59 bytes, summing all three sets. 

However, to assume that no other sequences 
are allowable is conservative. For instance, con- 
sidering just the new spacers that were observed 
in this work, there were 48,773 genome-derived, 
186 plasmid-derived, and 23 oligo-derived spacers 
of 33 bases that included an AAG PAM in their 
protospacer context. Using this pool of validated 
sequences in our recording paradigm would yield 
log2[P(48982,5)] = ~77.9 bits per set, or ~29.21
bytes of potential encoding capacity for all three 
sets. Again, this estimation is certainly overcon- 
strained because these sequences are drawn from 
an incredibly small subset of all possible sequences. 
Nonetheless, in the interest of being cautious, 
we can say that the recording capacity of the 3 x 5 
paradigm is not less than 2.59 bytes nor more 
than 101.25 bytes and likely falls somewhere be- 
tween 29.21 and 101.25 bytes. By also considering 
the ability to control spacer orientation (an extra 
modality), we could potentially encode an addi- 
tional 5 bits per set. Of course, this only reflects 
the information of our current recordings, which 
we arbitrarily limited to 15 spacers. Native species 
have been found with as many as 458 spacers in a 
single cell (S. tokodaii) (40). This illustrates the
potential space to encode complex biological 
phenomena, such as the transcriptional time 
course of many genes in a cell by means of re- 
verse transcription of mRNA protospacers (47). 
We anticipate that such a recording system will be 
valuable in applications that require tracing long 
histories of in vivo cellular activity, including devel- 
opment, lineage, and activity in the brain (42, 43). 
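
The capacity bounds quoted in the preceding paragraphs follow from a few lines of arithmetic, reproduced here as a Python sketch for readers who wish to check them. The constant and variable names are ours. The sketch assumes 2 bits per base and reads the extra orientation bits as one forward-or-reverse bit per recorded spacer.

```python
from math import log2, perm

SETS, ROUNDS, PAYLOAD_BASES = 3, 5, 27

# Upper bound: every payload base is free, four bases (8 bits) per byte.
upper_bytes = SETS * ROUNDS * PAYLOAD_BASES / 4

# Lower bound: only the order of five fixed sequences per set carries information.
lower_bits_per_set = log2(perm(ROUNDS, ROUNDS))
lower_bytes = SETS * lower_bits_per_set / 8

# Intermediate estimate: choose from the spacers observed with an AAG PAM.
pool = 48_773 + 186 + 23
pool_bits_per_set = log2(perm(pool, ROUNDS))
pool_bytes = SETS * pool_bits_per_set / 8

# Orientation modality: one bit per spacer in a set.
orientation_bits_per_set = ROUNDS

print(upper_bytes)
print(round(lower_bits_per_set, 1), round(lower_bytes, 2))
print(pool, round(pool_bits_per_set, 1), round(pool_bytes, 2))
print(orientation_bits_per_set)
```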


Materials and methods 
Bacterial strains and 
culturing conditions 


Expression and new spacer acquisition were car- 
ried out in BL21-AI cells. Unless otherwise spe- 
cified, cells were grown in Luria Broth (LB) shaking 
(240 rpm) at 37°C. Genes expressed from the T7lac 
promoter were induced using L-arabinose (Sigma- 
Aldrich) at a final concentration of 0.2% (w/w) 
from a 20% stock solution in water and isopropyl-
beta-D-thiogalactopyranoside (IPTG; Sigma-Aldrich)
at a final concentration of 1 mM from a 100 mM
stock solution in water. Cas mutants expressed 
from the pLtetO promoter were induced via 


anhydrotetracycline (aTc; Clontech) at a final 
concentration of 214 nM from a 214 µM stock in
50% ethanol. While expressing from the pLtetO 
promoter, 0.2% glucose was added to reduce un- 
intended background expression from the T7lac 
promoter. For new spacer acquisition experiments 
not involving oligo-derived spacers, cells were 
induced and grown overnight (16 h). All cloning
was performed using NEB5α cells.


Cloning and library construction 


Plasmid containing Cas1 and Cas2 under the 
expression of a T7lac promoter (pWUR 1+2) 
was a generous gift of Udi Qimron (23). A variant 
of this plasmid was created harboring an addi- 
tional CRISPR array based on an array found in 
the K12 strain. This additional array was syn- 
thesized and cloned into pWUR 1+2 to generate 
pWUKI 1+2. Cas1+2 were cloned into pRSF-DUET 
for a different plasmid context (pRSF-DUET 1/2). 
Cas1 and Cas2 were extracted from pWUR 1+2 by
PCR and re-cloned into the same plasmid separately.
In the case of Cas1, the selection was also
changed in this step from spectinomycin to ampicillin
to create pWURA Cas1 and pWUR Cas2. The
point mutation E9Q was introduced into Cas2 by
PCR to generate pWUKI Cas1+Cas2 E9Q. Similarly,
point mutants of Cas1+2 based on mutants from 
the directed evolution experiment were created by 
PCR. Mutant 89 from the directed evolution ex- 
periment was cloned into pWUR 1+2 along with 
a terminator, pLtetO, and the tetR repressor from 
pJKR-H-tetR (44) to create pWUR 1+2 tetO mut89. 
Mutant library was created via error-prone PCR 
using GeneMorph II Random Mutagenesis Kit 
(Agilent) and cloned into ElectroTen-Blue ultra- 
competent cells (Agilent) before being transferred 
to the expression strain (BL21-AI). For additional 
details see plasmid table (table S2). 


Oligo protospacer electroporation 


For spacer acquisition experiments involving
oligo-derived spacers, cells were first grown overnight
from individual plated clones. In the morning,
100 µl of the overnight culture was diluted
into 3 ml of LB, with induction components as
dictated by the experiment. Cells were grown
with inducers for 2 h. For an individual experimental
condition, 1 ml of this culture was pelleted
and re-suspended in water. Cells were
further washed by two additional pelleting and
re-suspension steps, then pelleted a final time
and re-suspended in 50 µl of a 3.125 µM solution
of double-stranded oligonucleotides (unless otherwise
noted) synthesized by IDT (Integrated DNA
Technologies). All pelleting steps were via centrifugation
at 13,000 × g for 1 min, and the entire
process from the first pelleting to the final re-suspension
was carried out at 4°C. Finally, the
cell-oligo mixture was transferred to a 1-mm-gap
cuvette and electroporated using a Bio-Rad Gene
Pulser set to 1.8 kV and 25 µF with pulse controller
at 200 Ω. Only those conditions with an electroporation
time constant > 4.0 ms were carried
through to analysis. Immediately after electroporation,
cells were transferred into a culture tube
containing 3 ml of LB and grown for 2 h (unless
otherwise noted). At this time, 50 µl of the
culture was lysed by heating to 95°C for 5 min,
cooled, then either used directly for analysis or
saved for later analysis at -20°C. For multi-day
recordings, 50 µl of the culture was used to inoculate
an overnight culture (in the absence of
inducers) to restart the process the next day.


Analysis of spacer acquisition 


Qualitative assessment of new spacer acquisi- 
tion was achieved by PCR across the array (for 
all expansions) or PCR from either side of the 
array with the opposite primer matching the 
oligo that was electroporated (for sequence-specific 
acquisition). New spacer sequences were assigned 
to their origin in initial experiments by TOPO 
cloning (ThermoFisher) the expanded amplicons, 
followed by Sanger sequencing of the resulting 
colonies. For the majority of experiments, how- 
ever, acquisition events were assessed by sequenc- 
ing a library of all expanded and unexpanded 
arrays for a given condition using an Illumina 
MiSeq sequencer. Libraries were created from an 
initial PCR across the genomic array, then single- 
or dual-indexed using NEBNext Multiplex Oligos 
(NEB). Up to 96 conditions were run per flow cell. 
A list of oligo protospacers used can be found in 
table S4. 


Processing and analysis of MiSeq data 


Sequences were analyzed using custom written 
software (Python). Briefly, spacer sequences were 
extracted from reads based on their arrangement 
between identifiable repeat sequences (four mis- 
matches permitted in the repeat to allow for errors 
in sequencing), then compared against the se- 
quences of spacers that populated the array prior 
to the experiment (five mismatches allowed against 
old spacers) to identify new spacers. At this time, 
metrics were collected as to the number of ex- 
panded versus unexpanded arrays, the number 
of expansions in each array, the position of new 
expansions, and the length of new spacers. The 
sequences of new spacers were then blasted (NCBI, 
blastn) against a database containing the genome, 
plasmid, and any electroporated oligo sequences. 
From this, origin and orientation were determined 
as was the protospacer flanking sequence for PAM 
analysis. To analyze the recordings over time, all 
reads containing double and triple expansions 
were analyzed. Oligo-derived sequences were 
identified based on their frequency among all 
new spacers, then, if applicable, set identifiers 
were extracted based on their known location in 
the sequences and sets of oligo-derived sequences 
were assembled. The order of all oligo-derived 
spacers relative to each other and genome- or 
plasmid-derived spacers in pairwise comparisons 
in all double and triple expanded arrays was as- 
sessed. Then, those values were used to test all 
ordered permutations of the oligo-derived across 
each of the ordering rules. Sets were analyzed 
independently. An estimate of the time course of 
spacer acquisition was inferred by relative qPCR 
Ct values at all time points, referenced to a quan- 
titative analysis of expansions by MiSeq at the 
two-hour time point. Library sizes for various mutant
libraries were estimated by sequencing of
fragmented mutant amplicons on a MiSeq sequencer.
Sequence diversity was estimated as
$S_1 = S_{obs} + F_1^2/(2F_2)$, where $S_{obs}$ is the number of observed
unique sequences in the sample, $F_1$ is the
number of sequences with a single occurrence, and
$F_2$ is the number of sequences with exactly two
occurrences (45).
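The diversity estimate described above, built from the counts of singleton and doubleton sequences, has the form of the Chao1 richness estimator. The following is a minimal sketch of that calculation, not the authors' software. The function name `chao1`, the toy read list, and the fallback used when no doubletons are present (the common bias-corrected form) are assumptions added for illustration.

```python
from collections import Counter


def chao1(sequences):
    """Estimate total sequence diversity from observed reads.

    S_obs = number of distinct sequences, F1 = sequences seen exactly once,
    F2 = sequences seen exactly twice; estimate = S_obs + F1^2 / (2 * F2).
    """
    counts = Counter(sequences)
    s_obs = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    f2 = sum(1 for c in counts.values() if c == 2)
    if f2 == 0:
        # Assumption: fall back to the bias-corrected form to avoid dividing by zero.
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)


# Toy reads (hypothetical): 6 distinct sequences, 3 singletons, 2 doubletons.
reads = ["AAA", "AAA", "AAA", "CCC", "CCC", "GGG", "GGG", "TTT", "ACG", "TGA"]
print(chao1(reads))
```

For the toy input the estimate is 6 + 3²/(2 × 2) = 8.25, that is, roughly two further unique sequences are inferred to be present but unsampled.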


Statistics 
See table S1.
Orbital angular


momentum microlaser 


Pei Miao,* Zhifeng Zhang,* Jingbo Sun,* Wiktor Walasik, Stefano Longhi,
Natalia M. Litchinitser,† Liang Feng†


Structured light provides an additional degree of freedom for modern optics and practical 
applications. The effective generation of orbital angular momentum (OAM) lasing, especially 
at a micro- and nanoscale, could address the growing demand for information capacity. 

By exploiting the emerging non-Hermitian photonics design at an exceptional point, we 
demonstrate a microring laser producing a single-mode OAM vortex lasing with the ability 
to precisely define the topological charge of the OAM mode. The polarization associated 
with OAM lasing can be further manipulated on demand, creating a radially polarized vortex 
emission. Our OAM microlaser could find applications in the next generation of integrated 
optoelectronic devices for optical communications in both quantum and classical regimes. 


Light typically consists of a stream of linearly
polarized photons, traveling in a straight 
line and carrying a linear momentum. How- 
ever, it was recognized that beyond the 
linear momentum, circularly polarized light 
carries angular momentum (1). The angular momentum
associated with the polarization degree
of freedom, or spin angular momentum (SAM),
can take only one of two values, ±ħ. In addition to
the SAM, it was also demonstrated that a light 
beam can carry orbital angular momentum (OAM) 
(2). Such beams possess helical phase fronts so 
that the Poynting vector within the beam is 
twisted with respect to the principal axis. This 
fundamental discovery of an OAM opened a new 
branch of optical physics, facilitating studies rang- 
ing from rotary photon drag (3), angular uncer- 
tainty relationships (4), and rotational frequency 
shifts (5), to spin-orbital coupling (6). The OAM 
degree of freedom has enabled technological ad- 
vances, for example, edge-enhanced microscopy 
(7). Moreover, in contrast to the SAM that can 
take only two values, the OAM is unbounded. 
OAM beams are thus being considered as poten- 
tial candidates for encoding information in both 
quantum and classical systems. The combined 
use of spin and orbital angular momenta is 
expected to enable the implementation of entirely 
new high-speed secure optical communication 
and quantum teleportation systems in a multi- 
dimensional space (8), satisfying the exponen- 
tially growing demand worldwide for network 
capacity. 
To date, most light sources produce only
relatively simple light beams with spatially homogeneous
polarization and planar wavefront. Generation
of the complex OAM beams usually relies
on either bulk devices, such as spiral phase plates, 
spatial light modulators, and computer-generated 
holograms (7), or recently developed planar op- 
tical components, including phase modulation- 
based metasurfaces (9-14), q-plates (15), and silicon 
resonators (16). Although the science of the OAM 
light beams on the micro- and nanoscale is still 
in its early days, it is likely to advance our 
knowledge of light interaction with conventional
and artificial atoms (e.g., quantum dots) provided
that the OAM beam is focused to subwavelength
dimensions (17), facilitating on-chip
functionalities for micromanipulation and micro- 
fluidics. Nevertheless, it remains a grand chal- 
lenge to integrate the existing approaches for 
OAM microlasers on-a-chip. For an ultimate min- 
iaturized optical communication platform, there 
is a necessity of independent micro- and nano- 
scale laser sources (18) emitting complex vector 
beams carrying the OAM information. 

One approach to creating an OAM laser (19) 
is based on combining a conventional bulk laser 
with additional phase-front shaping components. 
Despite being straightforward, this approach re- 
lies on rather different device technologies and 
material platforms, and therefore it is not easily 
scalable and integratable. On the contrary, here 
we integrate the advantages of semiconductor 
microlasers with the pronounced changes in light 
propagation at the exceptional point to realize a 
fundamentally new, compact, active OAM source 
on a complementary metal-oxide-semiconductor 
(CMOS) compatible platform. We consider a mic- 
roring cavity that supports whispering gallery 
modes (WGMs). These modes circulate inside 
the cavity and carry large OAM. However, because 
of the mirror symmetry of a ring cavity, clockwise 
and counterclockwise eigen-WGMs can be simul- 
taneously excited, and their carried OAMs con- 
sequently cancel each other. This is evidenced by 
the quantized phase, taking values of either 0 or
π, azimuthally distributed in the ring, which results
from the interference between two counterpropagating
WGMs (fig. S1) (20). To observe the
OAM of an individual WGM, it is essential to 


Fig. 1. Design of OAM microlaser. (A) Schematic of the OAM microlaser on an InP substrate. The diameter
of the microring resonator is 9 µm, the width is 1.1 µm, and the height is 1.5 µm (500 nm of InGaAsP
and 1 µm of InP). Thirteen-nanometer Ge single-layer and 5-nm Cr/11-nm Ge bilayer structures are
periodically arranged in the azimuthal direction on top of the InGaAsP/InP microring, mimicking real index
and gain/loss parts of an EP modulation at n′ = n″ = 0.01 to support unidirectional power circulation. The
designed azimuthal order is N = 56 at the resonant wavelength of 1472 nm. Equidistant sidewall scatters
with a total number of M = 57 are introduced to couple the lasing emission upward, creating an OAM vortex
emission with a helical wavefront. Its topological charge is defined by l = N − M = −1. (B) Simulated phase
distribution of emitted light. A spiral phase map for an OAM charge-one vortex is clearly demonstrated.
introduce a mechanism of robust selection of
either clockwise or counterclockwise mode. In 
conventional bulk optics, unidirectional ring lasers 
have been demonstrated by implementing a non- 
reciprocal isolator in the light path. The optical 
isolator breaks the reciprocity between counter- 
propagating waves, facilitating the desired uni- 
directional flow. This approach, however, is not 
feasible at the micro- and nanoscale, as the reali- 
zation of micrometer-sized isolators is extremely 
challenging. 

To overcome this fundamental limitation, we 
realize the unidirectional power circulation by 
introducing complex refractive-index modulations 
to form an exceptional point (EP) (Fig. 1A). Driven 
by non-Hermiticity (i.e., gain and loss in optics) 
(21, 22), an EP occurs when multiple eigenstates 
coalesce into one (23-26). In our device, EP ope- 
ration is essential to obtaining OAM laser emis- 
sion (20). The microring laser resonator is designed 
with 500-nm-thick InGaAsP multiple quantum 
wells on an InP substrate. The complex refractive- 
index grating is achieved by placing on top of 
InGaAsP along the azimuthal direction (θ) periodically
alternating single-layer Ge and bilayer
Cr/Ge structures, corresponding to the refractive
index (n′) and gain/loss (n″) in the cavity,
respectively:


1 
in" for 2np/N < 0 < 2n Pig /N 
An = 3 5 
n! for an(p+3)/N< 9<2n(p+) (6 


(1) 


where N denotes the azimuthal number of the
targeted WGM and p takes integer values from
the set {0, N − 1}. An EP is obtained when the
amplitudes of index and gain/loss gratings are
set equal (i.e., n′ = n″). At EP, the Fourier transform
of the complex refractive-index modulation
is one-sided, yielding one-way distributed feedback 
(27-29) and robust unidirectional laser emission 
above threshold, as shown by a detailed semicon- 
ductor rate equation analysis (20). As a result, 
the counterclockwise WGM unidirectionally cir- 
culates in the cavity carrying large OAM through 
the azimuthally continuous phase evolution (figs.
S2 and S3) (20).
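The statement that the Fourier transform of the modulation becomes one-sided when the index and gain/loss amplitudes are equal can be checked numerically. The sketch below deliberately uses an idealized sinusoidal modulation, n′ cos(qθ) + i n″ sin(qθ), rather than the stepwise Ge and Cr/Ge layout of Eq. 1. That functional form, the choice q = 2N (the azimuthal harmonic that would couple the +N and −N modes), and all function names are assumptions made for illustration only.

```python
import cmath
import math


def fourier_coefficient(delta_n, m, samples=4096):
    """Approximate c_m = (1/2π) ∫ Δn(θ) exp(-i m θ) dθ with a uniform Riemann sum."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        total += delta_n(theta) * cmath.exp(-1j * m * theta)
    return total / samples


def idealized_modulation(n_real, n_imag, q):
    """Assumed smooth stand-in for the grating: n' cos(qθ) + i n'' sin(qθ)."""
    def delta_n(theta):
        return n_real * math.cos(q * theta) + 1j * n_imag * math.sin(q * theta)
    return delta_n


N = 56       # azimuthal order quoted in the text
Q = 2 * N    # assumed grating harmonic linking the two counterpropagating modes

at_ep = idealized_modulation(0.01, 0.01, Q)     # n' = n'' (EP condition)
off_ep = idealized_modulation(0.01, 0.005, Q)   # n' != n'' for comparison

for label, modulation in (("n' = n''", at_ep), ("n' != n''", off_ep)):
    forward = abs(fourier_coefficient(modulation, +Q))
    backward = abs(fourier_coefficient(modulation, -Q))
    print(f"{label}: |c(+{Q})| = {forward:.4f}, |c(-{Q})| = {backward:.4f}")
```

With equal amplitudes one of the two harmonics vanishes, so the grating feeds power from one circulating mode into the other but not back. With unequal amplitudes both harmonics survive and the feedback is two-way. Which rotation direction is favored depends on sign conventions that this sketch does not try to fix.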

The OAM associated with the unidirectional 
power flow is extracted upward into free space 
by introducing sidewall modulations periodically 
arranged along the microring perimeter (16). The
azimuthal phase dependence of the targeted unidirectional
Nth WGM is given by φ = Nθ. The
sidewall modulations coherently scatter light, 
with the phase continuously varying in azimuthal 
direction, defined by the locations of the scatters 
(Fig. 1A, inset). For M equidistant scatters, the 
locations of the scatters are given by θ_s = 2πs/M,
where s ∈ {0, M − 1}, resulting in the extracted
phase φ_s = 2πsN/M that carries OAM. Because
the physically meaningful phase is measured
modulo 2π, we can subtract 2πs from each of the
extracted phases and derive


φ_s = 2πs(N − M)/M    (2)




Equation 2 shows that the extracted phase
increases linearly from 0 to 2π(N − M), thereby
creating a vortex beam with topological charge
l = N − M. Figure 1B shows the modeling result
of the vortex laser emission from our OAM microlaser,
where N = 56 and M = 57. The phase of

Fig. 2. Scanning electron microscope images of OAM microlaser. The OAM microlaser was fabricated 
on the InGaAsP/InP platform. Alternating Cr/Ge bilayer and Ge single-layer structures were periodically 
implemented in the azimuthal direction on top of the microring, presenting, respectively, the gain/loss 
and index modulations required for unidirectional power circulation. 
Fig. 3. Characterization of OAM lasing. (A) Evolution of the light emission spectrum from PL, to ASE, and
to lasing at 1474 nm, as the peak power density of pump light was increased from 0.63, to 0.68, to 2.19 GW m⁻²,
respectively. (B) Input-output laser curve, showing a lasing threshold of ~1 GW m⁻². (C) Far-field intensity
distribution of the laser emission exhibiting a doughnut-shaped profile, where the central dark core is due
to the phase singularity at the center of the OAM vortex radiation. (D) Off-center self-interference of the
OAM lasing radiation, showing two inverted forks (marked with arrows) located at two phase singularities.
Originating from the superposition of central helical and outer quasiplanar phases intrinsically associated
with OAM, the double-fork pattern confirms the OAM vortex nature of the laser radiation.


Fig. 4. Polarization state of OAM lasing. Measured intensity distributions of the OAM lasing radiation passing through a linear polarizer with different polarization
orientations indicated by arrows: (A) 0°, (B) 90°, (C) 45°, and (D) −45°. The two-lobe structure rotated with the rotation of the polarizer in the same fashion,
confirming radially polarized OAM lasing.


the electric field changes by 2π upon one full
circle around the center of the vortex. The phase 
is continuous everywhere except for the center 
of the emission path, presenting a topological 
phase singularity point at the beam axis. The 
topological charge of the vortex emission can 
be viewed as the number of twists done by the 
wavefront in one wavelength, exhibiting OAM 
lasing of charge l = −1.
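The phase bookkeeping behind the topological charge can be reproduced in a few lines. The sketch below samples the phase of an Nth-order WGM at M equidistant scatterer positions, checks that subtracting 2πs leaves the scattered field unchanged, and counts how many times the extracted phase winds around the ring. The function names are hypothetical; only N = 56 and M = 57 come from the text.

```python
import cmath
import math


def raw_phases(n_order, m_scatterers):
    """Phase N*θ_s of the WGM at scatterer positions θ_s = 2πs/M."""
    return [2.0 * math.pi * s * n_order / m_scatterers for s in range(m_scatterers)]


def reduced_phases(n_order, m_scatterers):
    """Same phases after subtracting 2πs from each one, as in Eq. 2."""
    return [2.0 * math.pi * s * (n_order - m_scatterers) / m_scatterers
            for s in range(m_scatterers)]


def topological_charge(n_order, m_scatterers):
    """Charge of the emitted vortex, l = N - M."""
    return n_order - m_scatterers


N, M = 56, 57
raw = raw_phases(N, M)
reduced = reduced_phases(N, M)

# Phases that differ by multiples of 2π describe the same field.
same_field = all(abs(cmath.exp(1j * a) - cmath.exp(1j * b)) < 1e-9
                 for a, b in zip(raw, reduced))

# Phase step between neighboring scatterers, accumulated over one full turn.
step = reduced[1] - reduced[0]
winding = round(step * M / (2.0 * math.pi))

print(same_field, winding, topological_charge(N, M))
```

Under these definitions, choosing M one larger than N gives a single clockwise or counterclockwise twist per turn, and other values of N − M would give higher-order vortices in the same way.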

The OAM microlaser with the EP modula- 
tion by periodically arranged Ge and Cr/Ge (Fig. 
2) was fabricated by means of overlay electron 
beam lithography (20). The unidirectional power 
flow oscillating in the cavity eliminates the un- 
desired spatial hole-burning effect that would 
be created by the interference pattern of two 
counterpropagating WGMs. The preferential 
gain saturation in the antinodes of the interfer- 
ence pattern would cause spatial gain inhomo- 
geneity, leading to a decrease in the laser slope 
efficiency, multilongitudinal mode operation, 
and unstable laser emission. In our OAM mic- 
rolaser, unidirectional power flow forced at the 
EP modulation (fig. S3) (20) enables efficient and 
stable single-mode lasing with a sideband sup- 
pression ratio of ~40 dB (Fig. 3A). In the tran- 
sition from broadband photoluminescence (PL), 
to amplified spontaneous emission (ASE), and 
finally to lasing (Fig. 3, A and B), the emission 
peak stabilized at the same resonant wavelength, 
demonstrating the avoidance of multimode osci- 
llation typically existing in a microring cavity. 
The OAM characteristics, such as the vortex 
nature and the phase singularity, were charac- 
terized by analyzing the spatial intensity profile 
of lasing emission and its self-interference (fig. 
S4) (20). In the far field, we observed the inten- 
sity of lasing emission spatially distributed in a 
doughnut shape with a dark core in the center 
(Fig. 3C). The observed dark center is due to 
the topological phase singularity at the beam 
axis where the phase becomes discontinuous, 
as predicted in Fig. 1B. The presence of the OAM 
was then validated by the self-interference of two 
doughnut-shaped beams split from the same 
lasing emission. In each doughnut beam, be- 
cause of its OAM, optical phase varies more 
markedly with a helical phase front close to the 
central singularity area, whereas the outer doughnut
area is of a relatively uniform quasiplanar
phase front. At the observation plane, we inten- 
tionally created a horizontal offset between two 
doughnut beams, so that the dark center of one 
beam overlapped with the bright doughnut area 
of the other, and vice versa. The resulting in- 
terference patterns between the helical and 
quasiplanar phase fronts revealed two inverted 
forks (Fig. 3D), as the quasiplanar and helical 
phases were reversed at the centers of two 
doughnuts. For both of them, the single fringe 
split into two at the fork dislocation, evidently 
confirming that the radiation from our OAM 
laser was an optical vortex of topological charge
l = −1.

The polarization properties of the demonstra- 
ted OAM microlaser can be designed on demand. 
In particular, radially polarized beams, charac- 
terized by a nonuniform spatial distribution of 
their polarization vector, have enabled unique 
functionalities, such as high-spatial resolution 
microscopy by their sharp focusing (30). Although 
the conventional schemes require external op- 
tical components, such as geometric phase-based 
diffraction elements (9), radially polarized beams 
can be directly produced from our OAM micro- 
laser. In a microring cavity, the resonant mode 
can be designed to be either quasi-transverse 
magnetic (TM) or quasi-transverse electric (TE). 
The radially polarized component of the quasi- 
TM mode is tightly confined at the microring 
perimeter and sensitive to sidewall modulations, 
facilitating the outcoupling of this mode from the 
laser (fig. S5) (20). Therefore, in our microring 
cavity, the dominant oscillating mode is designed 
to be a quasi-TM mode, and its scattering by the 
sidewall modulation results in the radially polar- 
ized OAM lasing. In experiments, the polarization 
state of the OAM lasing was validated. After trans- 
mission through a linear polarizer, the doughnut 
profile splits into two lobes aligned along the 
orientation of the polarizer (Fig. 4). The two 
lobes remained parallel to the polarization axis 
regardless of the rotation of the polarizer, man- 
ifesting pure radially polarized OAM lasing. 
Additionally, in contrast to linearly polarized 
OAM modes that are not compatible with op- 
tical fibers, fibers can support radially polar- 
ized OAM eigenmodes. 


We have demonstrated a microring OAM laser 
producing an optical vortex beam with an on- 
demand topological charge and vector polariza- 
tion states. This is enabled through combined 
index and gain/loss modulations at an EP, which 
breaks the mirror symmetry in the lasing gen- 
eration dynamics and facilitates the unidirec- 
tional power oscillation. Finally, OAM vector 
laser beams might offer novel degrees of freedom 
for the next generation of optical communica- 
tions in both classical and quantum regimes. 
ELECTROCHEMISTRY


Nanostructured transition metal 
dichalcogenide electrocatalysts for 
CO2 reduction in ionic liquid


Mohammad Asadi, Kibum Kim,* Cong Liu,* Aditya Venkata Addepalli,
Pedram Abbasi, Poya Yasaei, Patrick Phillips, Amirhossein Behranginia,
José M. Cerrato, Richard Haasch, Peter Zapol, Bijandra Kumar, Robert F. Klie,
Jeremiah Abiade, Larry A. Curtiss,† Amin Salehi-Khojin†


Conversion of carbon dioxide (CO2) into fuels is an attractive solution to many energy and 
environmental challenges. However, the chemical inertness of CO2 renders many 
electrochemical and photochemical conversion processes inefficient. We report a transition 
metal dichalcogenide nanoarchitecture for catalytic electrochemical CO2 conversion to carbon 
monoxide (CO) in an ionic liquid. We found that tungsten diselenide nanoflakes show a current 
density of 18.95 milliamperes per square centimeter, CO faradaic efficiency of 24%, and CO 
formation turnover frequency of 0.28 per second at a low overpotential of 54 millivolts. We also 
applied this catalyst in a light-harvesting artificial leaf platform that concurrently oxidized water
in the absence of any external potential.


Electrochemical or photochemical reduction
of carbon dioxide (CO2) could in principle
conveniently recycle the greenhouse gas back
into fuels (1-6). However, existing catalysts
are too inefficient in practice (7-11): Either
weak binding interactions between the reaction 
intermediates and the catalyst give rise to high 
overpotentials, or slow electron transfer kinetics 
result in low exchange current densities. Both of 
these metrics depend not only on the intrinsic 
electronic properties of the catalyst, but also 
on the solvent and the catalyst morphology. Re- 
cently, we reported that three-dimensional (3D) 
bulk molybdenum disulfide (MoS2) catalyzes CO2
reduction to CO at an extremely low overpo- 
tential (54 mV) (12) in an ionic liquid (IL). Here, 
we report 2D nanoflake (NF) architectures of 
this and other transition metal dichalcogenides 
(TMDCs) that manifest much higher performance 
for electrocatalytic CO2 reduction in the IL 1-ethyl-
3-methylimidazolium tetrafluoroborate (EMIM-BF4).
CO2 reduction activities of similarly sized
(~100 nm) TMDC NFs including MoS2, WS2,
MoSe2, and WSe2 were tested using a rotating
disc electrode. All TMDCs were grown using a
chemical vapor transport technique (13). Figure
1A shows cyclic voltammetry (CV) results of
WSe2 NFs and bulk MoS2, as well as Ag nanoparticles
(Ag NPs) and bulk Ag as representative
noble-metal catalysts. All experiments were
performed inside a two-compartment, three-electrode
electrochemical cell (fig. S6) using an
electrolyte of 50 volume percent (vol %) EMIM-BF4
and 50 vol % deionized water; this composition
gives the maximum CO2 reduction activity
(13). The polarization curves of all studied catalysts
were obtained by sweeping potential between
+0.8 and -0.764 V versus RHE (reversible
hydrogen electrode; all potentials reported here
are based on RHE) with a scan rate of 50 mV s⁻¹
(Fig. 1A and fig. S8). We also performed chronoamperometry
at different applied potentials for
WSe2 NFs. The results indicate that the obtained
current densities for all applied potentials are
10 to 20% less than the CV results with 50 mV/s
scan rate (fig. S9). The difference is attributed to
the charging current (capacitive behavior) in the
CV measurements.

The CO2 reduction began at -0.164 V (overpotential
of 54 mV) for WSe2 NFs, as confirmed
by faradaic efficiency (FE) measurements (Fig.
1B). At this potential (overpotential of 54 mV), a
current density of 18.95 mA/cm² (normalized
on the basis of geometrical surface area) was obtained
for WSe2 NFs; by comparison, current densities
were 0.19 mA/cm² for bulk Ag, 1.57 mA/cm²
for Ag NPs, and 3.4 mA/cm² for bulk MoS2. The
CO formation FEs for WSe2 NFs (Fig. 1B) and
bulk MoS2 (12) were 24% and 3%, respectively.
However, the Ag NPs and bulk Ag did not reduce
CO2 at this overpotential. At -0.764 V potential,
the recorded current density for WSe2 NFs was
330 mA cm⁻², versus 3.3 mA cm⁻² for bulk Ag,
11 mA cm⁻² for Ag NPs, and 65 mA cm⁻² for bulk
MoS2. The CO formation turnover frequency (TOF)
(Fig. 1C) (13), a measure of per-site activity of
catalysts to produce CO, was 0.28 s⁻¹ for WSe2
NFs versus 0.016 s⁻¹ for bulk MoS2. However, this
value was zero for Ag NPs, as they could not
produce CO at this overpotential (54 mV). Figure
1C also shows that the CO formation TOF of
WSe2 was approximately three orders of magnitude
higher than that of Ag NPs in the overpotential
range of 150 to 650 mV.

Gas chromatography and differential electrochemical
mass spectroscopy analyses indicated
that CO and H2 were the only gas-phase products
(5, 11, 12, 14-16) in the potential range of 0 to
-0.764 V (13). The measured FE for WSe2 NFs/IL
(Fig. 1B) showed that this system is highly selective
for CO formation at high potentials (-0.2
to -0.764 V). However, at smaller potentials (-0.164
to -0.2 V), it produces a mixture of CO and H2
(synthesis gas). Figure S13 shows the selectivity
(FE) results of all TMDCs tested in this study (13).

The catalytic performance of TMDC NFs was
compared with that of other reported catalysts
(Fig. 1D) by multiplying current density (activity)
by CO formation FE (selectivity). At 100 mV
overpotential, the performance of WSe2 NFs exceeded
that of bulk MoS2 and Ag NPs tested under
identical conditions in an ionic liquid by a factor
of nearly 60. The performance of WSe2 NFs also
exceeds those of Au NPs (17) and Cu NPs (18) by
three orders of magnitude. Additionally, at this
overpotential, the performance of WSe2 exceeded
that of WS2 and MoSe2 NFs by factors of 3 and 2,
respectively (Fig. 1D). We also performed chronoamperometry
experiments to examine the electrochemical
stability of WSe2 NFs in 50:50 vol
% IL/deionized water. At the applied potential of
-0.364 V (0.254 V overpotential), a small decay
(10%) was observed after 27 hours of continuous
operation of the three-electrode two-compartment
cell (fig. S14) (13).
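The two quantities used throughout this comparison, overpotential and the activity-times-selectivity product, are simple to compute. The sketch below is an illustration only. The equilibrium potential of -0.11 V versus RHE is not stated explicitly in the text; it is inferred here from the quoted pairs (-0.164 V with 54 mV, -0.364 V with 0.254 V), and the function names are hypothetical.

```python
# Assumed CO2/CO equilibrium potential vs RHE, inferred from the
# potential/overpotential pairs quoted in the text.
E_EQ_V = -0.11


def overpotential_mv(potential_v):
    """Overpotential (mV) for a cathodic potential given in V vs RHE."""
    return (E_EQ_V - potential_v) * 1000.0


def co_partial_current(current_density, fe_percent):
    """Performance metric: total current density times CO faradaic efficiency."""
    return current_density * fe_percent / 100.0


for potential in (-0.164, -0.364, -0.764):
    print(f"{potential:+.3f} V vs RHE -> {overpotential_mv(potential):.0f} mV overpotential")

# Values quoted in the text at 54 mV overpotential (mA/cm², %).
wse2 = co_partial_current(18.95, 24.0)
bulk_mos2 = co_partial_current(3.4, 3.0)
print(f"WSe2 NFs: {wse2:.3f} mA/cm², bulk MoS2: {bulk_mos2:.3f} mA/cm²")
```

These example products are evaluated at 54 mV, so they are not directly comparable to the ratio quoted at 100 mV overpotential.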

The photochemical performance of WSe2/IL
was also studied using a custom-built wireless 
setup. This artificial leaf mimics the photosynthesis 
process in the absence of any external applied 
potential. The cell (Fig. 2A) (13) is composed of 
three major segments: (i) two amorphous silicon 
triple-junction photovoltaic (PV-a-si-3jn) cells in 
series to harvest light, (ii) the WSe2/IL cocatalyst
Fig. 1. CO2 reduction performance of the TMDC catalysts, Ag NPs, and bulk Ag in the EMIM-BF4
solution. (A) Cyclic voltammetry (CV) curves for WSe2 NFs, bulk MoS2, Ag nanoparticles (Ag NPs),
and bulk Ag in CO2 environment. Inset shows the current densities in low overpotentials. (B) CO and
H2 overall faradaic efficiency (FE) at different applied potentials for WSe2 NFs. The error bars represent
SD of four measurements. (C) CO formation TOF of WSe2 NFs, bulk MoS2, and Ag NPs in IL electrolyte
at overpotentials of 54 to 650 mV. At 54 mV overpotential, Ag NPs’ result is zero. (D) Overview of
different catalysts’ performance at different overpotentials (η). All TMDC and Ag NP data were
obtained from chronoamperometry experiments under identical conditions. Data for other catalysts
are from (12).
Fig. 2. Artificial leaf. (A) Schematic of the cell design. (B) Rate of product formation (mol/s) with respect
to different Sun illuminations. (C) Calculated solar-to-fuel efficiency (SFE) of photochemical process
using the WSe2/IL cocatalyst system. Our calculation indicates ~4.6% SFE, which is limited by the
maximum efficiency of PV-a-si-3jn (~6.0%). Error bars indicate uncertainty in the calculated SFE
(table S5) (13).
system on the cathode side for CO2 reduction,
and (iii) cobalt (CoII) oxide/hydroxide in potassium
phosphate pH = 7.0 (KPi) electrolyte
on the anode side to catalyze the oxygen evolution
reaction (19-21). Because the CO2 reduction
and oxygen evolution reactions in the artificial
leaf system are electrically coupled together
through the photovoltaic, the production and
consumption rates of electrons and protons are
equal on the anode and cathode sides of the
cell. When the reaction starts, proton (H+) generation
is initiated in the KPi solution (anode
side) through the oxygen evolution reaction (decreasing
the initial pH of 7); on the IL electrolyte
(cathode side), the CO2 reduction reaction consumes
H+ available in the IL electrolyte (increasing
the initial pH of ~3.2). During this transient
period, diffusion of the K+ ions through the proton
exchange membrane from the anode to the
cathode side compensates the charge imbalance
to achieve charge neutrality (fig. S17) (13).
However, after this period (~5 min), the pH in
the KPi solution and the IL approaches ~3.4,
where the operation of the artificial leaf system
reaches steady state and H+ diffuses in place of
K+. Our measurements indicate that 1.43 × 10⁻⁴
M K+ diffuses to the IL in the transient stage and
that its concentration remains constant under
steady-state operation. This quantification is
consistent with the change in the H+ concentration
(1.52 × 10⁻⁴ M) on the cathode side. The
PV-a-si-3jn cell can function continuously for 
5 hours (fig. S18) before corrosion of the transparent
indium tin oxide layer on the anode inhibits
operation (fig. S19) (23). However, replacing
the PV-a-si-3jn cells restores performance to its 
previous level. 

To test the stability of ionic liquid (50 vol % 
EMIM-BF, in water) and KPi electrolytes, we 
replaced the PV-a-si-3jn cells every 4 hours for 
a cumulative time of 100 hours. Results shown 
in table S3 (13) indicate that, within error mar- 
gins, the same molar quantities of CO and Hy 
were produced during each 4-hour period of the 
100-hour-long operation of the artificial leaf, con- 
firming the durability of both anode (KPi) and 
cathode (50/50 vol% IL/water) electrolytes. We 
also observed no significant change in the pH of 
the solution after 100 hours (13). Moreover, our 
calculations (13) indicate that a negligible amount 
of water (~0.018 ml/hour) is produced during the 
CO₂ reduction reaction on the cathode side relative
to the total volume of electrolyte (100 ml of
IL/water) used in our experiment. 

Figure 2B shows the molar rates of product 
formation with respect to simulated solar il- 
lumination (number of Suns). The respective 
yields of CO and H₂ measured by GC follow an
approximately 10:1 ratio for the entire range of 
illuminations. This result is consistent with our 
FE measurements obtained at higher overpoten- 
tials in the three-electrode electrochemical setup. 
We also calculated the solar-to-fuel conversion 
efficiency (SFE) for our photochemical process 
(Fig. 2C), obtaining a value of ~4.6% limited by 
the maximum efficiency of the PV-a-si-3jn cell 
(~6.0%) (13, 20). This SFE is higher than that of 
the water-splitting reaction (~2.5%) previously
measured using an identical triple-junction pho- 
tovoltaic (PV-a-si-3jn) cell (20). Our measurements 
indicate that the SFE of the system remains stable 
for 5 hours of continuous operation (fig. S20) (13). 
The SFEs of the artificial leaf during 100 hours of 
operation (with successive PV-a-si-3jn replace- 
ments) are shown in table S6. 
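
The link between the ~10:1 CO:H₂ molar ratio and a Faradaic efficiency can be made explicit with a short calculation. The sketch below is illustrative only and is not the authors' analysis: it assumes that CO and H₂ are the only products, and it uses the fact that both CO₂ → CO and 2H⁺ → H₂ are two-electron reductions. The function name `charge_shares` is hypothetical.

```python
def charge_shares(moles, electrons_per_molecule):
    """Return the share of total charge consumed by each product."""
    charge = {p: moles[p] * electrons_per_molecule[p] for p in moles}
    total = sum(charge.values())
    return {p: q / total for p, q in charge.items()}


# Relative molar yields taken from the ~10:1 CO:H2 ratio quoted in the text.
moles = {"CO": 10.0, "H2": 1.0}
# Assumption: CO and H2 are the only products; each needs two electrons.
electrons = {"CO": 2, "H2": 2}

shares = charge_shares(moles, electrons)
for product, share in shares.items():
    print(f"{product}: {share:.1%} of the charge")
```

Under these assumptions, roughly 91% of the charge goes to CO, because equal electron counts make the charge ratio equal to the molar ratio.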

Next, we performed electrochemical impedance
spectroscopy (EIS) at 150 mV overpotential
to measure the charge transfer resistance (R_ct)
for WSe₂ NFs, bulk MoS₂, and Ag NPs catalysts
(13, 22-24). A charge transfer resistance is
correlated to the number of electrons transferred
from the catalyst surface to the reactant (25-27) as
well as intermediate formation inside the double
layer (22, 23). Our experimental results (Fig. 3)
indicate that the R_ct of WSe₂ is ~180 ohms, versus
~420 ohms for bulk MoS₂ and ~550 ohms
for Ag NPs.

Additionally, we measured the work function 
of WSe₂ and the other TMDCs by ultraviolet
photoelectron spectroscopy (UPS) (11, 13) (fig.
S22). Our results indicate a considerably lower
work function of WSe₂ NFs (3.52 eV) compared
to bulk MoS₂ (3.99 eV) (12) and Ag NPs (4.38 eV),
confirming the EIS data. These results provide
evidence that the superior electronic properties
of W edge atoms result in a faster electron transfer
and consequently higher catalytic activity
during CO₂ reduction.

Fig. 3. Electrochemical impedance spectroscopy
of CO₂ reduction using WSe₂ NFs, bulk MoS₂,
and Ag NPs at 150 mV overpotential.

To characterize the atomic arrangement of 
edge atoms, we performed scanning transmis- 
sion electron microscopy (STEM) analysis on 
several liquid-exfoliated monolayers and multiple
layers of WSe₂ NFs (fig. S23A) (13). The line
intensity profile of single-layer WSe₂ NF (fig.
S23B) indicates that the edges of the nanoflakes 
are W-terminated. Moreover, STEM analysis 
on the edge atoms after 27 hours of chrono- 
amperometry (fig. S14) indicates a stable atomic 
structure of W edge sites (fig. S23, C and D). 
These results suggest that transition metals 
with d-orbital electrons on the edge sites mainly
contribute to the CO₂ reduction without any
evidence of instability over time. X-ray photo- 
electron spectroscopy data further verified the 
long-term stability of the catalyst (fig. S24, A to 
D) (22, 13). 

Fig. 4. Density functional theory analysis. (A) Calculated free energy diagrams for CO₂ electroreduction
to CO on Ag(111), Ag₅₅ NPs, MoS₂, WS₂, MoSe₂, and WSe₂ NFs at 0 V RHE. (B) Calculated partial density of
states of the d band (spin-up) of the surface Ag atom of Ag₅₅. (C) Calculated partial density of states of the
surface bare metal edge atom (W) of the WSe₂ NFs. The calculations of the Ag systems are at CO coverage
of 1/16 ML; those of the TMDC systems are at CO coverage of 1 ML. See (13) for details of the coverage
effects in the TMDC systems.

Density functional theory (DFT) calculations
were performed to gain insight into the catalytic
properties of the TMDC NFs (13). The calculated
reaction free energies of the CO₂ → CO pathway
using a computational hydrogen electrode approach
at zero potential (28) show that the formation of
COOH* is highly endergonic and is the rate-limiting
step for both Ag(111) and the Ag₅₅ cluster
(Fig. 4A). The Ag₅₅ cluster requires less energy
than Ag(111) to form COOH* because of the presence
of undercoordinated Ag atoms of the cluster.
This result explains the lower overpotential of Ag
nanoparticles relative to bulk Ag, in agreement
with other studies (7). COOH* formation is similarly
endergonic on other metal surfaces such
as Pd, Au, and Cu (7, 29, 30). However, on the 
metallic edges of the TMDC NFs, COOH* for- 
mation is exergonic because of strong binding 
to the TMDC metal edge sites. The CO* is also 
much more stable on the TMDC NFs than on 
Ag, residing at lower energy than COOH*. This
energetic ordering suggests that the formation
of CO* from CO₂ is kinetically more favorable
on the TMDC NFs than on Ag (Fig. 4A) resulting 
in lower overpotentials. In addition, the calcu- 
lated projected density of states of the edge metal 
atom (Mo or W) reveals that the d-band centers 
of these metal edges (Fig. 4, B and C, and fig. S26, 
A to D) (13) are much closer to the Fermi level 
than those of the Ag(111) surface, further sup- 
porting the strong binding interactions of the 
adsorbed intermediates with the TMDC NFs 
(31, 32). However, the strong binding of CO on 
the TMDCs also inhibits desorption of CO, which 
becomes the rate-limiting step in the TMDC 
systems. Previous studies have indicated that the 
coverage of the adsorbed intermediates signifi- 
cantly affects the binding energies (33, 34). We 
further investigated the effect of CO coverage on 
the metal edge of the TMDCs and found that 
each metal atom on the TMDC edge can bind up 
to two CO molecules (θ_CO ≤ 2 ML) (13). The
binding energy per CO on a metal atom (when
θ_CO = 1 ML) ranges from 0.8 to 1.1 eV (Fig. 4A),
whereas the binding energy of the second CO on
a metal atom (when θ_CO = 2 ML) decreases to
0.3 to 0.5 eV. This finding suggests that the metal
edges of the TMDCs likely have high CO coverage
(θ_CO > 1 ML) during the catalytic reaction to
maintain a high turnover rate. 

We also calculated the work functions of the 
monolayers of the four TMDCs (fig. S27). Our 
calculations show a trend of MoS₂ > WS₂ > MoSe₂ >
WSe₂, consistent with experimental measurements
of the work functions (fig. S22) (13). This trend of 
work functions correlates with the trend of the 
experimental activities as measured by current 
densities of the four TMDCs, suggesting that the 
electron transfer properties of the TMDCs play an 
important role in the electrochemical reduction 
of CO₂. In this case, WSe₂ has the lowest work
function and is the best TMDC for CO₂ activation
among the tested materials. 

The ionic liquid also plays an important role in
the CO₂ electrochemical reduction. Our previous
study suggested that the EMIM⁺ ion helps transport
CO₂ to the catalyst surface by complexation
under acidic conditions (12). Overall, we
attribute the exceptional performance of the present
catalysts to a combination of low overpotentials
and efficient electron transfer properties
of the TMDC NFs and the IL-enhanced local CO₂
concentration.
PALEOCEANOGRAPHY

North Atlantic ocean circulation and abrupt climate change during the last glaciation

L. G. Henry, J. F. McManus, W. B. Curry, N. L. Roberts, A. M. Piotrowski, L. D. Keigwin


The most recent ice age was characterized by rapid and hemispherically asynchronous 
climate oscillations, whose origin remains unresolved. Variations in oceanic meridional 
heat transport may contribute to these repeated climate changes, which were most 
pronounced during marine isotope stage 3, the glacial interval 25 thousand to 60 thousand 
years ago. We examined climate and ocean circulation proxies throughout this interval at 
high resolution in a deep North Atlantic sediment core, combining the kinematic tracer 
protactinium/thorium (Pa/Th) with the deep water-mass tracer, epibenthic δ¹³C. These
indicators suggest reduced Atlantic overturning circulation during every cool northern 
stadial, with the greatest reductions during episodic Hudson Strait iceberg discharges, 
while sharp northern warming followed reinvigorated overturning. These results provide 
direct evidence for the ocean’s persistent, central role in abrupt glacial climate change. 


Unlike the relatively stable preindustrial climate
of the past 10 thousand years, glacial
climate was characterized by repeated millennial
oscillations (1). These alternating
cold stadial and warm interstadial events
were most abrupt and pronounced on Greenland 
and across much of the northern hemisphere, with 
the most extreme regional conditions during sev- 
eral Heinrich (H) events (2), catastrophic iceberg 
discharges into the subpolar North Atlantic Ocean. 
These abrupt events not only had an impact on 
global climate but also are associated with wide- 
spread reorganizations of the planet’s ecosystems 
(3). Geochemical fingerprinting of the ice-rafted 
detritus (IRD) associated with the most pronounced 
of these events consistently indicates a source in 
the Hudson Strait (HS) (4), so we abbreviate this 
subset of H events as HS events and their follow- 
ing cool periods as HS stadials. During northern 
stadials, ice cores show that Antarctica warmed, 
and each subsequent rapid northern hemisphere 
warming was followed shortly by cooling at high 
southern latitudes (5). Explanations for the rapid- 
ity and asynchrony of these climate changes require 
a mechanism for partitioning heat on a planetary 
scale, initiated either through reorganization of 
atmospheric structure (6) or the ocean’s thermo- 
haline circulation, particularly the Atlantic me- 
ridional overturning circulation (AMOC) (7-10). 
Coupled climate models have successfully used 
each of these mechanisms to generate time series 
that replicate climate variability observed in pa- 
leoclimate archives (9, 11). We investigated the 
relationship between Northern Hemispheric climate
as recorded in Greenland ice cores and
marine sediments, along with isotopic deep-sea 
paleoproxies sensitive to changes in North Atlan- 
tic Deep Water (NADW) production and AMOC 
transport during marine isotope stage three (MIS3). 
Throughout that time, when global climate was 
neither as warm as today nor as cold as the last 
glacial maximum (LGM), ice sheets of interme- 
diate size blanketed much of the northern hemi- 
sphere, and large millennial stadial-interstadial 
climate swings (6, 8) provide a wide dynamic range 
that allows examination of the ocean’s role in 
abrupt change. 

Sediment samples were taken from the long 
(35 m) core KNR191-CDH19—recovered from the 
Bermuda Rise (33° 41.443’ N; 57° 34.559’ W, 4541 m
water depth) in the northwestern Atlantic Ocean 
(Fig. 1), near previous seafloor sampling at Inte- 
grated Ocean Drilling Program (IODP) site 1063— 
and coring sites KNR31 GPC-5, EN120 GGC-1, 
MD95-2036, OCE326-GGC5, and others. Because 
this region of the deep North Atlantic is charac- 
terized by steep lateral gradients in tracers of 
NADW and Antarctic Bottom Water (AABW), 
the Bermuda Rise has been intensively used to 
explore the connection between changes in ocean 
circulation and climate (7, 12). In this study, we 
measured the radioisotopes ²³¹Pa and ²³⁰Th in
bulk sediment, age-corrected to the time of deposition,
along with stable carbon (δ¹³C) and oxygen
(δ¹⁸O) isotope ratios in the microfossil shells
of both epibenthic foraminifera (Cibicidoides 
wuellerstorfi and Nuttallides umbonifera) and 
planktonic foraminifera (Globigerinoides ruber), 
respectively, yielding inferences on relative resi- 
dence times and the origin of deep water masses 
on centennial time scales. 

Isotopes of protactinium and thorium, ²³¹Pa and
²³⁰Th, are produced from the decay of ²³⁵U and
²³⁴U, respectively, dissolved in seawater. The
activity of ²³¹Pa and ²³⁰Th in excess of the amount
supported by the decay of uranium within the
crystal lattice of the sediment’s mineral grains is
denoted by ²³¹Pa_xs and ²³⁰Th_xs. Because the parent U
isotopes have long residence times, U is well mixed
throughout the ocean. This yields a ²³¹Pa_xs/²³⁰Th_xs
(hereafter Pa/Th) production ratio (Pa/Th = 0.093)
that is constant and uniformly distributed (13, 14).
Both daughter isotopes are removed by adsorption
onto settling particles, with Th more efficiently
scavenged than Pa. The residence time
of ²³¹Pa_xs (τ_res = ~200 years) in seawater is thus
greater than that of ²³⁰Th_xs (τ_res = ~30 years),
allowing ²³¹Pa_xs to be redistributed laterally by
changes in basin-scale circulation before deposition
(7, 14-16), with the additional potential influence
of removal because of changes in particle
rain associated with biological productivity (17).
Settling particles (18) and surface sediments
throughout the basin reveal a deficit in ²³¹Pa_xs
burial that is consistent with large-scale export
by the deep circulation (Fig. 1) (19).

The downcore Pa/Th in core CDH-19 ranges 
from ~0.05 to slightly above the production ratio 
of 0.093, with a series of well-defined variations 
throughout MIS3 (Fig. 2). In sediments deposited 
during Greenland interstadial intervals (1), Pa/Th 
ratios average 0.0609 ± 0.0074 (20), which is substantially
below the production ratio (Fig. 2) and
only 10% higher than the mean value (Pa/Th =
0.055) of the Holocene, a time of relatively vigorous
AMOC (7). Because ²³⁰Th_xs is buried in near
balance with its production (20), the relatively low
Pa/Th indicates a substantial lateral export of
²³¹Pa_xs, which is consistent with relatively vigorous
AMOC during interstadials, although the vertical
integration through the water column of this
deficit does not distinguish whether this export
occurred at deep or intermediate levels. Epibenthic
δ¹³C (δ¹³C_BF) data allow discrimination between
these two possibilities and display increased values
during each interstadial, implying a greater contribution
of the isotopically more positive North
Atlantic end member (Fig. 2). During these intervals,
this positive isotopic signal suggests that
a deeper overturning cell was established, rather
than a shallower, yet more vigorous one. This confirms
a previous suggestion of intervals of relatively
strong AMOC within the most recent ice
age (21, 22), although Pa/Th and δ¹³C_BF adjusted
for whole-ocean inventory changes (23) rarely
reach early Holocene values.

Pa/Th increases within each Greenland stadial 
interval, for a mean duration of 0.531 ± 0.303
thousand years to a Pa/Th value of 0.0797 ± 0.0154,
which indicates decreased lateral export of ²³¹Pa_xs
and is consistent with a shallower or reduced over- 
turning cell in the North Atlantic. During these 




Fig. 1. Study core location and coretop distribution of Pa/Th. The location of sediment core CDH19
is indicated with a star (33° 41.443’ N; 57° 34.559’ W, 4541-m water depth), with Pa/Th ratios (black
dots) in core top sediments used with Ocean Data View Data-Interpolating Variational Analysis 
gridding to produce the color contours. White areas contain no data. 




Fig. 2. Climate and circulation indices through MIS3. Stadials
are numbered with vertical bars. (A) NGRIP ice core δ¹⁸O_ice, 75.1°N,
42.32°W (35). (B) SST (°C) from MD95-2036, 33° 41.444'N, 57°
34.548'W, 4462 m (31). (C) Calcium x-ray fluorescence (orange) from
core CDH19 (this study) mapped to %CaCO₃, with calibration r² = 0.87
(S.1), with spectral reflectance (blue) from core MD95-2036 (36).
(D) Pa/Th from bulk sediment (green) taken from core CDH19.
(E) δ¹³C_BF from core CDH19 (purple) alternates between
values consistent with southern and northern sourced δ¹³C_BF end
members.
Fig. 3. Detail of millennial cyclicity in glacial climate and the deep ocean. (A) through (E) are as in
Fig. 2, A to E. (F) Simulated NADW (Sv) in a coupled ocean/atmosphere model (11), with (D) published
Pa/Th (gray squares) (21) and δ¹³C_BF data (blue crosses) (12).


stadials, δ¹³C_BF decreases substantially to negative
values [-0.2 per mill (‰) to -0.5‰], suggesting
greater influence of the glacial equivalent of
modern Antarctic Bottom Water (AABW), an isotopic
result that is consistent with reduced AMOC
from a coupled climate model (10). Although the
northern and southern water mass end members
are not well known throughout the last glaciation,
deep waters in the Atlantic during the LGM
ranged from less than -0.5‰ in the south to
greater than 1.5‰ in the north (23). If these values
prevailed throughout MIS3, then the low δ¹³C_BF
indicates a dominant stadial influence of south- 
ern waters and substantial northward retreat or 
shoaling of the AABW/NADW mixing zone, 
which is consistent with the deep water mass con- 
figuration that has previously been reconstructed 
for the LGM (23, 24), although not for millennial- 
scale stadial intervals within the glaciation. 

The mean Pa/Th of both stadials and interstadials
is consistent with export of ²³¹Pa_xs from
the subtropical North Atlantic during most of
MIS3. During peak interstadials, when low Pa/Th
indicates the local burial of approximately half
of ²³¹Pa_xs production, the remaining half would
have been exported. In contrast, the substantial
decrease in the lateral export of ²³¹Pa_xs, evident
in higher Pa/Th, along with lower δ¹³C_BF during
each stadial interval, points to repeated
reductions in AMOC and its attendant northward
heat transport throughout MIS3. The contrast 
between apparent deep, vigorous overturning 
during interstadials and shallower (25), weaker 
overturning during stadials is most pronounced 
in conjunction with all HS stadials (Fig. 2), when 
catastrophic discharge of melting icebergs from 
Canada flooded the subpolar North Atlantic (4). 

Sediments deposited during HS stadials are 
characterized by a mean duration of 1.65 ± 0.545
thousand years and an average Pa/Th of 0.095 ±
0.016, which is indistinguishable from the production
ratio. These results therefore indicate no
net export of ²³¹Pa_xs from the subtropical North
Atlantic during these events sourced from the 
Hudson Strait. This balance between seawater 
radiometric production and underlying sedi- 
mentary burial would be expected under con- 
ditions with a substantial reduction in AMOC 
or other lateral transport and might imply a near 
cessation of ²³¹Pa_xs export through deep circulation.
Although variable scavenging may also contribute
to sedimentary Pa/Th, values throughout
MIS3 bear only a weak relationship with bulk
and opal fluxes [coefficient of determination (r²) =
0.19] (19), which therefore constitute secondary
influences. 
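
The reasoning that turns a sedimentary Pa/Th value into an export estimate can be sketched numerically. The snippet below is an illustrative simplification, not the authors' method: it assumes, as the text states, that ²³⁰Th_xs is buried in near balance with its production, so that Pa/Th divided by the production ratio (0.093) approximates the locally buried fraction of ²³¹Pa_xs production. The function name `exported_fraction` is hypothetical, and scavenging effects and measurement uncertainties are ignored.

```python
# Illustrative sketch only; not the authors' code.
PRODUCTION_RATIO = 0.093  # Pa/Th production ratio quoted in the text


def exported_fraction(pa_th, production_ratio=PRODUCTION_RATIO):
    """Fraction of 231Pa production that is not buried locally.

    Assumes 230Th is buried in balance with its production, so the
    sedimentary Pa/Th divided by the production ratio approximates
    the fraction of 231Pa production buried at the site.
    """
    buried = pa_th / production_ratio
    return 1.0 - buried


# Mean values reported in the text (uncertainties omitted).
reported_means = {
    "Holocene": 0.055,
    "interstadials": 0.0609,
    "stadials": 0.0797,
    "HS stadials": 0.095,
}

for label, pa_th in reported_means.items():
    print(f"{label:>13}: Pa/Th = {pa_th:.4f}, "
          f"exported fraction = {exported_fraction(pa_th):+.2f}")
```

With these inputs the interstadial mean corresponds to roughly one-third of production exported, and the lowest downcore values (~0.05) to roughly half, in line with the "approximately half" quoted for peak interstadials. The HS-stadial mean gives a value slightly below zero, which, given the reported ± 0.016, is indistinguishable from no net export.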

These new results reveal that AMOC variations 
were associated with every MIS3 stadial-interstadial 
oscillation, with the largest reductions during HS 
stadials. The well-resolved interval 35 thousand 
to 50 thousand years ago provides a good ex- 
ample (Fig. 3). This iconic interval contains H4, 
H5, and the intervening series of oscillations that 
have served as a basis for conceptual and com- 
puter models seeking to explain such variability 
(8-11, 26, 27). A previous Pa/Th record (27) cover- 
ing this interval captured much of the overall am- 
plitude, and the new data resolve each stadial 
increase in Pa/Th, indicating that only HS4 and 
HS5 reach the production ratio of 0.093. Because 
the interstadial values are similar to each other, 
the subsequent abrupt increases in AMOC and 
regional warming are also the greatest and occur 

Fig. 4. Phasing lag correlations. Correlation of NGRIP ice core δ¹⁸O with
CDH19 CaCO₃ flux (blue), Pa/Th of bulk sediment from CDH19 (green),
δ¹³C_BF from CDH19 (purple), and SST (°C) from MD95-2036 (31) (red).


within the century-scale response time of Pa/Th. 
Throughout the records, the Pa/Th and δ¹³C_BF
bear a striking similarity to model output forced 
by freshwater anomalies (11). 

Combined with previous investigations (7, 28), 
these new results confirm that all HS events of 
the past 60 thousand years were associated with 
a dramatic increase in Pa/Th and are evidence 
for major reduction in AMOC in association 
with the largest IRD events (29). In contrast, H3, 
the sole Heinrich event stadial that fails to reach 
the production ratio (peak Pa/Th = 0.079), dis- 
plays smaller IRD fluxes across the subpolar At- 
lantic (29), with provenance inconsistent with 
a Hudson Strait source (4). This muted result for 
H3 is consistent with evidence from the Florida 
Straits (30) showing a smaller reduction at that 
time in the northward flow of near-surface waters 
that feed the overturning circulation. As with all 
stadials, the HS events are characterized by lower 
δ¹³C_BF, suggesting diminished influence of NADW
and proportionately greater AABW on Bermuda
Rise. Combined Pa/Th and δ¹³C_BF results therefore
indicate a persistent pattern of stadial weakening
and interstadial strengthening, with a
repeatedly largest reduction in AMOC associated 
with all HS events. Although these observations 
are consistent with a number of numerical model 
simulations (11, 27) as well as conceptual models 
for the mechanisms of abrupt change, they have previously
been difficult to document and fully resolve.

Recent data from the Western Antarctic ice sheet provide
compelling evidence for a robust lead of Greenland climate
over Antarctica (5). That analysis revealed a Northern
Hemisphere lead of 208 ± 96 years, indicating that the
interhemispheric teleconnection propagates from north to
south on time scales consistent with basin-scale ocean
circulation. To ascertain whether Northern Hemisphere
climate is forced or reinforced by changes in AMOC, we
investigated the phase relationship between surface and
deep-sea properties. Cross-correlations with North Greenland
Ice Core Project (NGRIP) δ¹⁸O were performed on each of
δ¹³C_BF, Pa/Th, sea surface temperature (SST), and CaCO₃
from sediment cores CDH19 and MD95-2036 from the Bermuda
Rise. The optimal correlation of δ¹³C_BF leads NGRIP δ¹⁸O
by approximately 2 centuries (Fig. 4). This lead is
corroborated by Pa/Th phasing, which when considering the
century-scale response time of the proxy (13, 14) is
consistent with AMOC changes indicated by δ¹³C_BF. The SST
reconstruction from MD95-2036 was aligned with Greenland
δ¹⁸O, yielding a correlation of r² = 0.83 (31). SST and
Pa/Th are synchronous with NGRIP to within the estimated
bioturbation error of 8 cm within the core, displaying
correlations with Greenland of r² = 0.47 for Pa/Th and
r² = 0.65 for SST. The optimal correlation of %CaCO₃,
r² = 0.64, lags NGRIP δ¹⁸O by nearly 200 years.
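
A lead or lag of this kind is typically read off the peak of a lagged cross-correlation. The sketch below shows one minimal way to do this with NumPy, assuming both series have already been interpolated onto a common, evenly spaced time axis ordered from oldest to youngest. It is not the authors' procedure; the function name `lagged_correlation`, the 50-year spacing, and the synthetic data are all hypothetical.

```python
import numpy as np


def lagged_correlation(proxy, reference, max_lag):
    """Pearson r between proxy[t] and reference[t + lag] for each lag.

    Both series must share one evenly spaced time axis, ordered from
    oldest to youngest. A positive best lag means the proxy leads.
    """
    proxy = np.asarray(proxy, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n = len(proxy)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(len(lags))
    for k, lag in enumerate(lags):
        if lag >= 0:
            a, b = proxy[:n - lag], reference[lag:]
        else:
            a, b = proxy[-lag:], reference[:n + lag]
        r[k] = np.corrcoef(a, b)[0, 1]
    return lags, r


# Synthetic demonstration (made-up data): the reference repeats the proxy
# signal 4 steps later, i.e. 200 years at a hypothetical 50-year spacing.
rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(400))
shift = 4
proxy = signal[shift:]
reference = signal[:-shift]

lags, r = lagged_correlation(proxy, reference, max_lag=10)
best_lag = int(lags[np.argmax(r)])
print("best lag (steps):", best_lag, "-> lead (years):", best_lag * 50)
```

In this toy case the correlation peaks at a lag of +4 steps, meaning the proxy leads the reference. Real records would additionally need age-model alignment and an uncertainty estimate for the lag, neither of which is attempted here.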

The consistent lead of variations in δ¹³C_BF before
SST and Greenland temperatures, repeated
over multiple millennial cycles, indicates the po- 
tential influence of AMOC on NH climate and 
confirms that the Bermuda Rise is exposed to 
shifts in deep-water mass mixing. Initially, deep 
circulation changes, which is evidenced overall 
by the timing of δ¹³C_BF. Pa/Th shifts are essentially
in tandem with regional temperature when
circulation accelerates, and soon thereafter as it 
responds to weakening AMOC (19). Given the re- 
sponse time of Pa/Th to instantaneous shifts in 
North Atlantic overturning (13, 14), this also suggests
that changes in AMOC precede regional
temperature change, although the exact timing 
may have differed during cooling and warming 
phases. Both SST and Greenland temperature 
proxies lag the ocean circulation in a consistent 
fashion, and in turn, these northern changes
have been demonstrated to lead Antarctic tem- 
peratures (5). Calcium-carbonate concentration 
is the last of the proxies to respond to AMOC 
change, which is consistent with the longer time 
scale of preservation, dissolution, and dilution 
in the deep ocean. 

The relative timing of the observed AMOC 
changes has important implications for regional 
and global climate. Whereas numerous compu- 
ter simulations suggest that melting icebergs and 
other freshwater input associated with H events 
may have shut down NADW production (9, 11, 27),
recent results examining the phasing of North 
Atlantic SST and IRD suggest that stadial con- 
ditions began to develop before ice-rafting (32). 
The evidence here nevertheless indicates that 
the greatest AMOC reduction and the coldest 
stadial intervals accompanied the largest iceberg 
discharges. This suggests that the iceberg dis- 
charges may have provided a positive feedback 
mechanism to accelerate the initial cooling with- 
in each multimillennial climate cycle. In addi- 
tion, the extended H-stadial reductions in AMOC 
observed in this study coincide with intervals 
of rising atmospheric CO₂ (33), whereas CO₂
declined when AMOC increased during the sub- 
sequent sharp transitions to northern intersta- 
dials, supporting a potential influence on the 
atmosphere by the deep circulation on millen- 
nial time scales (34). 

The robust relationship of reductions in export 
of northern deep waters evident in reduced ²³¹Pa_xs
export and decreased δ¹³C_BF before and during
stadial periods, and the dramatic increases in 
both during interstadials, provide direct evidence 
for the role of AMOC in abrupt glacial climate 
change. The sequence of marked circulation 
changes and northern hemisphere climate detailed 
here, combined with the demonstrated lag of 
Antarctic temperature variations (5), strongly im- 
plicates changes in meridional heat transport by 
the ocean as a trigger for abrupt northern hemi- 
sphere warming and the tipping of the “bipolar 
seesaw” (26). 
Teaching accreditation exams reveal
grading biases favor women in
male-dominated disciplines in France


Thomas Breda and Mélina Hillion


Discrimination against women is seen as one of the possible causes behind their 
underrepresentation in certain STEM (science, technology, engineering, and mathematics) 
subjects. We show that this is not the case for the competitive exams used to recruit 
almost all French secondary and postsecondary teachers and professors. Comparisons 
of oral non-gender-blind tests with written gender-blind tests for about 100,000 
individuals observed in 11 different fields over the period 2006-2013 reveal a bias in favor 
of women that is strongly increasing with the extent of a field’s male-domination. This bias 
turns from 3 to 5 percentile ranks for men in literature and foreign languages to about 
10 percentile ranks for women in math, physics, or philosophy. These findings have 
implications for the debate over what interventions are appropriate to increase the 
representation of women in fields in which they are currently underrepresented. 


Why are women underrepresented in most
areas of science, technology, engineering,
and mathematics (STEM)? One of the
most common explanations is that a hiring
bias against women exists in those fields
(1-4). This explanation is supported by a few older
experiments (5-7), a recent one with fictitious
resumes (8), and a recent lab experiment (9), which
suggest that the phenomenon still prevails.
However, some scholars have challenged this
view (10, 11), and another recent experiment with
fictitious resumes finds a bias in favor of women
in academic recruitment (12). Studies based on
actual hiring also find that when women apply to
tenure-track STEM positions, they are more likely
to be hired (13-18). However, those studies do
not control for applicants’ quality and a frequent
claim is that their results simply reflect that only 
the best female Ph.D.’s apply to these positions, 
whereas a larger fraction of males do so (11, 13). A 
study by one of us did partly control for applicants’ 
quality and reported a bias in favor of women in 
male-dominated fields (19). However, it has limited
external validity because it relies on only 3000
candidates who took the French École Normale
Supérieure entrance exam. 

The present analysis is based on a natural 
experiment involving >100,000 individuals who 
participated in competitive exams used to hire 
French primary, secondary, and college or uni- 
versity teachers over the period 2006-2013. It 
has two distinct advantages over all previous 
studies. First, it provides large-scale real-world 
evidence of gender biases in evaluation-based 
hiring in several fields. Second, it shows that 
those biases against or in favor of women are 
strongly shaped by the actual degree of female 
underrepresentation in the field in which the 
evaluation takes place, which partly reconciles 
existing studies. 


Carefully taking into account the extent of 
underrepresentation of women in 11 academic 
fields allowed us to extend the analysis beyond 
the STEM distinction. As pointed out recently 
(11, 12, 19, 20), the focus on STEM versus non- 
STEM fields can be misleading for understand- 
ing female underrepresentation in academia, as 
some STEM fields are not dominated by men 
[e.g., 54% of U.S. Ph.D.’s in molecular biology are 
women (21)], whereas some non-STEM fields, including
humanities, are male-dominated [e.g., only
31% of U.S. Ph.D.’s in philosophy are women (21)].
A better predictor of this underrepresentation, 
some have argued, is the belief that innate raw 
talent is the main requirement to succeed in 
the field (20). 

To study how female underrepresentation 
can shape skills assessment, we exploit the two- 
stage design of the three national exams used 
in France to recruit virtually all primary-school
teachers (CRPE) and middle- and high-school teachers
(CAPES and Agrégation), as well as a large
share of graduate school and university teachers,
who also take the Agrégation (22). A college
degree is necessary to take part in those com- 
petitive exams [table S1 in (22)]. Except for the 
lower level (CRPE), each exam is subject-specific 
and typically includes two or three written tests. 
The best candidates after those written tests 
(tables S2 and S3) are eligible for typically two 
or three oral tests taken no later than 3 months 
after the written tests (22). Note that oral tests are 
not general recruiting interviews: Depending 
on the subject, they include exercises, questions, 
or text discussions designed to assess candi- 
dates’ fundamental skills, exactly as written tests. 
Teachers or professors who have specialized 
in the subject grade all the tests. At the highest- 
level exam (Agrégation), 80% of evaluators are 
either full-time researchers or university pro- 
fessors in French academia. The correspond- 
ing statistic is 30% for the medium-level exam 
(CAPES). 

Our strategy exploits the “blinding” of the writ- 
ten tests (candidates’ name and gender are not 
known by the professors who grade these tests), 
whereas the oral tests are not blinded. If one as- 
sumes that female handwriting cannot be easily 
detected—which we discuss later—written tests 
provide a counterfactual measure of students’
cognitive ability in each subject.

Fig. 1. Female evaluation advantage or disadvantage …

The French evaluation data offer unique ad- 
vantages over previously published experiments; 
they provide real-world test scores for a large 
group of individuals. Thus, they avoid the usual 
problem of experiments’ limited external validity. 
At the same time, these data present a compelling 
“experiment of nature” in which naturally occurring
variations can be leveraged to provide controls.
A final advantage is being able to draw on
very rich administrative data that allow numer- 
ous statistical controls to be applied and compar- 
isons to be made across levels of evaluation, from 
lower-level (primary and secondary teaching) to 
college or university hiring. 

To assess gender bias in evaluation, we focused 
on candidates who took all oral and written tests, 
and we ranked them according to their total 
score on either written or oral tests. We then com- 
pared the variation of women’s mean percentile 
rank between written and oral tests to the same 
variation for men. This standardized measure is 
bounded between -1 and 1, and it is independent 
of the share of females among the total pool of 
applicants. It is equal to 1 if all women are below 
the men on written tests and above them on oral 
tests [see (22) for additional explanations]. For 
each subject-specific exam, we computed this 
measure and its statistical significance using a 
linear regression model, named DD1 in (22), of
the type $\Delta\mathrm{Rank}_i = a + bF_i + \epsilon_i$, where $\Delta\mathrm{Rank}_i$ is the
variation in rank between oral and written tests
of candidate $i$, $F_i$ is an indicator variable equal to
1 for female candidates and 0 for males, $\epsilon_i$ is an
error term, and $b$ is the measure of interest.
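The measure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `percentile_ranks` and `dd1_bonus` are hypothetical, ties are given their average rank, and the exact rank construction used in (22) may differ. With a single binary regressor, the ordinary least squares slope b is simply the mean rank change of women minus the mean rank change of men.

```python
from statistics import mean


def percentile_ranks(scores):
    """Map scores to percentile ranks in (0, 1]; tied scores share their average rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the tie group while the next score equals the current one.
        while end + 1 < n and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank of the tie group
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank / n
        pos = end + 1
    return ranks


def dd1_bonus(written, oral, female):
    """Slope b of dRank_i = a + b*F_i + e_i (hypothetical helper, not the authors' code).

    dRank_i is the oral percentile rank minus the written percentile rank.
    For a binary F_i, the OLS slope equals mean(dRank | women) - mean(dRank | men).
    """
    r_written = percentile_ranks(written)
    r_oral = percentile_ranks(oral)
    delta = [o - w for o, w in zip(r_oral, r_written)]
    women = [d for d, f in zip(delta, female) if f]
    men = [d for d, f in zip(delta, female) if not f]
    return mean(women) - mean(men)


# Toy data (made up): both women rank below both men in writing and above them orally.
female = [True, True, False, False]
written = [10, 11, 12, 13]
oral = [12, 13, 10, 11]
b = dd1_bonus(written, oral, female)
```

In this toy case the measure reaches its upper bound of 1, matching the statement that it equals 1 when all women are below the men on written tests and above them on oral tests.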

In fields in which women are underrepresented
(mathematics, physics, chemistry, and philosophy), 
oral tests favor women over men both on the 
higher-level exams (professorial and high-school 
teaching) and medium-level exams (secondary 
school teaching only) (Fig. 1) (P < 0.01 in all
cases; see sample sizes and detailed results in
table S4). In contrast, oral tests in fields in which 
women are well-represented (literature and foreign 
languages) favor men over women, but the differ- 
ences are smaller and not always significantly dif- 
ferent from 0 at the 5% statistical level (Fig. 1 and 
table S4). In history, geography, and social sci- 
ences, there are only small gender differences 
between oral and written tests. Those differences 
are not significantly different from 0 at the 5%
statistical level. In biology, a bias against women 
is found on the high-level exam only. With the 
exception of social sciences at the medium-level 
exam (22), all results are robust to the inclusion of 
control variables and to the use of a more gen- 
eral econometric model that allows for differ- 
ent returns to candidates’ fundamental skills 
between oral and written tests [see models DD2 
and DD3+IV in (22)]. 

A simple explanation for these results would 
be that examiners on oral tests try to lower the 
gender difference in ability observed on written 
tests. This is not always the case (Fig. 2): The 
oral tests sometimes fully invert a significant 
ranking gap between women and men on writ- 
ten tests (physics at the highest level, math at 
the medium level). 

A clear pattern emerges from Fig. 1: The more 
male-dominated a field is, the higher the bonus 
for women on the nonblinded oral tests. To formally
capture this pattern, we study how the bonus
b on oral tests varies with the share of women s 
among assistant professors and senior professors 
in the French academy [see (22) for statistical 
details and other measures of fields’ feminiza- 
tion, e.g., table S5]. We find a significant negative 
relation at both the higher- and medium-level 
exams (see table S6) (b = 0.25 - 0.53 s at the high-level
exam; b = 0.13 - 0.28 s at the medium-level
exam, with P < 0.02 for both slopes and intercepts
of the fitted lines).
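To make the fitted lines concrete, the sketch below evaluates them at two shares of women and solves for the share at which the predicted bonus is zero. The coefficients are the full-sample values quoted above; the names `FITS`, `predicted_bonus`, and `neutral_share` are hypothetical, and estimates obtained on subsamples elsewhere in the text need not match these numbers.

```python
# Fitted lines quoted in the text: b = intercept + slope * s,
# where s is the share of women among academics in the field.
FITS = {
    "high": (0.25, -0.53),
    "medium": (0.13, -0.28),
}


def predicted_bonus(level, s):
    """Predicted oral-test bonus for women (as a fraction of the rank scale)."""
    intercept, slope = FITS[level]
    return intercept + slope * s


def neutral_share(level):
    """Share of women s at which the fitted line predicts no bonus (b = 0)."""
    intercept, slope = FITS[level]
    return -intercept / slope


for level in FITS:
    print(level,
          round(predicted_bonus(level, 0.21), 3),  # a field with 21% women
          round(predicted_bonus(level, 0.62), 3),  # a field with 62% women
          round(neutral_share(level), 3))
```

Under these lines the predicted bonus changes sign when women make up a little under half of a field's academics, at both exam levels.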

The relation between the extent of a field’s 
male-dominance and female bonuses on oral tests 
at the highest-level exams (for high-school teach- 
ers and professorial) is about 150% of that at the 
medium-level exams. At the highest level, switch- 
ing from a subject as feminine as foreign languages 
(s = 0.62) to a subject as masculine as math (s = 
0.21) leads female candidates to gain, on average, 
17 percentile ranks on oral tests with respect to
written tests. To avoid sample-selection bias, this 
comparison between the medium- and the high- 
level exam is made on a subsample of about 3500 
individuals who have taken both exams in the 
same subject in the same year [(22), fig. S2, and
table S6]. 

Finally, the statistical analysis suggests an 
absence of large significant gender biases on 
oral tests for the lower-level teaching exam (22). 
Note that this exam is not subject-specific. How- 
ever, since 2011, all applicants have been required 
to take both an oral and a written test in math 
and literature, which makes it possible to study 
the bonus on oral tests for women in those two 
subjects. We find a small premium of around 
3 percentile ranks for women on oral tests, both 
in math and literature, with no clear difference 
between those two subjects (see table S7). This 
finding should, however, be considered with 
prudence because it can only be established 
with the more general econometric specification 
[see model DD3+IV in (22)]. 

The differences between written and oral tests 
on the specialized medium- and high-level exams 
have implications for the gender composition of 
newly recruited secondary and postsecondary 
teachers. Oral tests give the gender in the mi- 
nority better chances of being hired (fig. S1) 
and, therefore, induce a rebalancing of gender 
ratios between teachers hired in male- and 
female-dominated fields (table S8). We also find 
that the gender gaps between oral and written 
tests are very stable across the written test score 
distribution in all fields for the medium- and 
high-level exams (table S9). 

Should the differences between written and 
oral test scores be interpreted as evaluation biases? 
In natural experiments, the researcher does not 
have full control over the research design; thus,
the results usually need to be interpreted with 
caution. The setting we exploit has three po- 
tential issues: (i) gender may be inferred on writ- 
ten tests from handwriting; (ii) there might be 
gender differences in the types of abilities that are 
required on oral and written tests; and (iii) the 
way candidates self-select in a given field may 
depend on their gender. 

Fig. 2. Average rank difference between women and men on oral and written tests in each subject-specific
exam at the high and medium level. (A and B) Error bars indicate 95% confidence intervals
from Student's t test.

Fig. 3. Female advantage or disadvantage on an oral test that is identical in all fields. The difference
between women’s and men’s average rank on the oral test “Behave as an ethical and responsible
civil servant” in the different subject-specific medium-level exams (y axis). The size of each point
indicates the extent to which it is different from 0 (P value from Student's t test). Each field's extent of
(non-)male-domination is measured by the share of women academics in that field (x axis).


Tests that we previously conducted have shown
that the rate of success in guessing gender from
handwritten anonymous exam sheets is, on average,
68.6% (19). This suggests that examiners are
rarely certain about the candidates’ gender from
written tests [see additional details in (22)]. Their
limited ability to detect the gender of candidates
from the written tests would be truly problematic 
for the interpretation of our results if and only if 
those examiners were biased in opposite direc- 
tions on the written and oral tests. This assumption 
cannot be tested empirically but seems unlikely, 
given that the same examiners usually evaluate 
both the written and oral tests (22). Moreover, 
examiners’ bias is likely to be smaller when they 
face presumably female or male handwriting than 
when they are exposed to an actual female or 
male candidate during an oral test. Therefore, 
partial gender detection on written tests should, 
if anything, only attenuate the magnitude of the 
estimated biases, which would keep their direc- 
tion identified. 
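The attenuation argument can be illustrated with a toy model. The model is an assumption made for this sketch and does not appear in the source: graders guess gender from handwriting with accuracy p, and they apply a written-test bias to whomever they believe to be a woman. The function name `measured_bonus` is hypothetical.

```python
def measured_bonus(oral_bias, written_bias, p_correct):
    """Expected oral-minus-written gap for women under the toy model.

    Assumption (not from the source): written_bias is added to the score of
    anyone the grader *believes* is a woman, and a guess is right with
    probability p_correct. Actual women then receive the bias with probability
    p_correct and actual men with probability 1 - p_correct, so the written gap
    between women and men is (2 * p_correct - 1) * written_bias.
    """
    leaked = (2 * p_correct - 1) * written_bias
    return oral_bias - leaked


# Detection rate reported in the text: 68.6%.
p = 0.686
same_bias = measured_bonus(0.10, 0.10, p)     # written bias as strong as oral bias
weaker_bias = measured_bonus(0.10, 0.03, p)   # weaker bias on presumed handwriting
no_detection = measured_bonus(0.10, 0.10, 0.5)  # guessing at chance level
```

In this toy model, as long as the written-test bias has the same sign as the oral-test bias and is no larger, the measured gap is smaller than the true oral bias but keeps its sign. That mirrors the argument that partial detection attenuates the estimates without reversing them.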

A more fundamental issue is that the gap 
between a candidate’s oral and written test score 
in a given subject can capture the effect of gender- 
related attributes visible only from oral or written 
tests, such as the quality of handwriting, elocution, 
or emotional intelligence [see (23-26) for surveys 
on possible sex differences in cognitive abilities, 
including verbal fluency]. 

The first defense against those interpretations 
is that our key result is not the absolute gender 
gap in the oral versus written test scores in a given 
subject, but the variation—and even reversal—of 
this gap across subjects according to a regular 
pattern. If there are gender-specific differences 
in abilities required specifically for oral or writ- 
ten tests, these differences need to vary between 
male-dominated and other subjects to explain 
our results. For example, handwriting quality or 
elocution would both need to differ across gen- 
der and to be more rewarded for some subjects 
than for others. This could be true if the oral tests 
in the most male-dominated subjects are framed 
in a way that makes more visible the qualities 
that are more prevalent among women. 

To overcome these issues and a possible hand- 
writing detection problem, we exploit a remark- 
able feature of the teaching exams: since 2011, all 
of them have included an oral test entitled “Behave 
as an ethical and responsible civil servant” (BERCS). 
At the medium- and high-level exams, BERCS is 
the only test that is not subject-specific (27). This
oral interview is a subpart of an oral test that 
otherwise attempts to evaluate the competence 
in the exam core subject. It is consequently 
graded by teachers or professors specialized in 
the exam core subject. 

We have data on detailed scores for the BERCS 
test for the lower- and medium-level exams (22). 
Comparisons of gender differences in perform- 
ance on this oral test across subjects for the 
medium-level exam reveal that women systematically
get better grades, and that this bonus
b’ decreases with the share of women s in the 
exam’s overall subject area (Fig. 3, b’ = 0.12 - 
0.25 s, with P < 0.0001 for both the slope and the 
intercept, clustering by subjects). This pattern 
is similar to what is observed in Fig. 1 when 
comparing blind and nonblind subject-specific 
tests. However, the comparison across fields 
now relies on a single oral test that is identical 
in all exams. Consequently, the pattern in Fig. 3
cannot be influenced by (i) handwriting detection
or by (ii) the fact that the oral and written
tests evaluate different skills. Figure 3 also suggests
that examiners favor women who chose to
specialize in male-dominated subjects no matter
what they are tested on.

Fig. 4. Female advantage or disadvantage on the BERCS test for candidates taking both the lower-level
exam and a medium-level exam in either a male-dominated or a gender-neutral field. Rank
difference between women and men on the oral test “Behave as an ethical and responsible civil servant”
at the lower-level exam and at the medium-level exam among two samples of candidates: those who
took both the lower- and medium-level exams in a strongly male-dominated subject (math, physics, or
philosophy, left side, N = 60), and those who took both the lower- and medium-level exams in a more
gender-neutral subject (social sciences, history, geography, biology, literature, or foreign languages, right
side, N = 120). To control for selection, ranks at the tests have been computed within each sample,
ignoring other candidates that are not in the sample. Confidence intervals at the 90% level are given in
square brackets.

A last reason why our results could reflect
skill differences is that (iii) the populations
tested in the different subjects are not the same
and are self-selecting. The women who decided
to study math and take the math exams might
be especially confident in math and perform
better on oral tests for this reason, and the
same may hold for men in literature. Selection
may also explain the results of the BERCS test:
Women enrolled in the more male-dominated
exams may have better aptitude for that particular
oral test.

We can first reject that sample selection drives
our results in a specific case: at the medium-level
exam in physics-chemistry, the same candidates
have to take the oral and written tests in both
physics and chemistry. Among those candidates, the
bonus for women on oral tests is 9 percentile
points greater in physics than in chemistry, a
subject that is less male-dominated according to
all indicators. The idea that sample selection
does not drive the general pattern in Fig. 1 is also
confirmed by a previous analysis that is entirely
based on identical samples of candidates being
tested in different subjects (19).

To control for sample selection in the BERCS
test, we exploited a pattern of test-taking over
the period 2011-2013: A few candidates took both
the lower-level exam and the medium-level exam
in a specific subject. We used the grade obtained
from the BERCS test for the lower-level exam
(where this test is also mandatory and graded
as a subpart of the literature test) as a counterfactual
measure of ability. As the lower-level
exam is not subject-specific, it offers a counter- 
factual measure in a gender-neutral context. 
Among the small group of candidates who took 
both exams and took the medium-level exam in a 
less male-dominated subject (social sciences, 
history, geography, biology, literature, or for- 
eign languages), men get an advantage over 
women on the oral test BERCS that is signifi- 
cantly higher at the 5% level for the medium- 
level exam than for the lower-level exam (Fig. 4, 
P = 0.04, N = 120 candidates). The reverse is true 
(however, not statistically significant) among the 
group that took the medium-level exam in a male- 
dominated subject (math, physics-chemistry, or 
philosophy, N = 60). As both the test subject
and the sample of candidates are held constant 
in this last experiment, observed differences 
almost surely reflect examiners’ bias according 
to the extent of male-domination in the candi- 
dates’ field of specialization. 

In total, the various empirical checks provided 
here imply with high confidence that our results 
for the medium- and higher-level exams reflect 
evaluation biases rather than differences in candi- 
dates’ abilities. These biases rebalance gender 
asymmetries in academic fields by favoring the 
minority gender. For women, this runs counter 
to the claim of negative discrimination in re- 
cruitment of professors into math-based fields. 
If anything, women appear to be advantaged in 
those fields. In contrast, men appear to be ad- 
vantaged in recruitment into the most feminized 
fields. Those behaviors are stronger on the highest- 
level exams, where candidates are more skilled, 
and where initial gender imbalances between the 
different fields are largest (see table S2). 


Our results are compatible with two main 
mechanisms. First, evaluators may have differ- 
ent beliefs about female and male applicants 
in the different fields and may statistically dis- 
criminate accordingly. For example, females who 
have mastered the curriculum, and who apply 
for highly skilled jobs in male-dominated fields 
may signal that they do not elicit the general 
stereotypes associating quantitative ability with 
men. This may induce a rational belief reversal 
regarding the motivation or ability of those fe- 
male applicants (28), or a so-called “boomerang 
effect” (29) that modifies the attitudes toward 
them. Experimental evidence provides support 
for this theory by showing that gender biases are 
lower or even inverted when information clearly 
indicates high competence of those being eval- 
uated (29, 30). Second, evaluators may simply 
have a preference for gender diversity, either 
conscious (e.g., political reasons) or unconscious. 
Evidence shows that evaluation biases in favor 
of the minority gender in a given field are larger 
in years where this gender performs more poorly 
at written tests (table S10). This result, which 
should not be overinterpreted (22), tends to re- 
ject the first explanation and is consistent with 
the second one. 

Finally, for the math medium-level exam [the 
only one for which we have data on jury com- 
position (table S11)], we find no evidence that 
male (with respect to female) examiners sys- 
tematically favor female (with respect to male) 
candidates (table S12). This result is in line with 
previous research (12, 19, 31) and suggests that 
context effects (surrounding gender stereotypes) 
are more important than examiners’ gender in 
explaining gender biases in evaluation. It ex- 
cludes that between-fields variation in panel 
composition drives our results. We also checked 
(on the subsample for which we have detailed 
information) that examiners’ teaching levels do 
not affect their preferences and conclude that 
the higher proportion of assistant professors and 
professors who judge the higher-level exam can- 
not explain the stronger bonus obtained by the 
minority gender at that level. 

Even without being fully conclusive on the 
underlying mechanisms, the presented analy- 
ses shed light on the possible causes of the under- 
representation of women in many academic fields. 
They confirm evidence from a recent experiment 
with fictitious resumes (12) that women can be 
favored in male-dominated fields at high recruit- 
ing levels (from secondary school teaching to 
professorial hiring), once they have already spe- 
cialized and heavily invested in those fields 
(candidates on teaching exams hold at least a 
college or a master’s degree) (32). In contrast, 
the study of the recruiting process for primary 
schoolteachers suggests that prowomen biases 
in male-dominated fields may disappear in less 
prestigious and less selective hiring exams, where 
candidates are not necessarily specialized. Perhaps 
the bias in favor of women in male-dominated 
fields would even reverse at lower recruiting 
levels, as in experiments done with medium- 
skilled applicants (8, 9). Discrimination may 
then still impair women’s chances to pursue a
career in quantitative science (or philosophy), 
but only at the early stages of the curriculum, 
before or just as they enter the pipeline that leads 
to a Ph.D. or a professorial position. 

Nevertheless, there is no compelling evidence 
of hiring discrimination against individuals who 
have already decided against social norms to pur- 
sue an academic or a teaching career in a field 
where their own gender is in the minority. This 
result has three consequences for policy. First, 
active policies aimed at counteracting stereo- 
types and discrimination should probably focus 
on students at early ages, before educational 
choices are made. Second, nonblind evaluation 
and hiring should be favored over blind evaluation
in order to reduce gender imbalances across aca- 
demic fields. In particular, policies that impose 
anonymous curricula vitae in the first stage of 
academic hiring are likely to have effects opposite 
to those expected. Third, many women may shy 
away from male-dominated fields at early ages 
because they believe that they would suffer from 
discrimination. Advertising that they have at least 
as good—or even better—opportunities as their 
male counterparts at the levels of secondary 
school teaching and professorial recruiting could 
encourage talented young women to study in 
those fields. 

PARASITIC PLANTS

Detection of the plant parasite Cuscuta reflexa by a tomato cell surface receptor

Volker Hegenauer, Ursula Fürst, Bettina Kaiser, Matthew Smoker, Cyril Zipfel,
Georg Felix, Mark Stahl, Markus Albert


Parasitic plants are a constraint on agriculture worldwide. Cuscuta reflexa is a stem holoparasite
that infests most dicotyledonous plants. One exception is tomato, which is resistant to
C. reflexa. We discovered that tomato responds to a small peptide factor occurring in Cuscuta spp.
with immune responses typically activated after perception of microbe-associated molecular
patterns. We identified the cell surface receptor-like protein CUSCUTA RECEPTOR 1 (CuRe1) as
essential for the perception of this parasite-associated molecular pattern. CuRe1 is sufficient to
confer responsiveness to the Cuscuta factor and increased resistance to parasitic C. reflexa when
heterologously expressed in otherwise susceptible host plants. Our findings reveal that plants
recognize parasitic plants in a manner similar to perception of microbial pathogens.


Along with microbial pathogens and herbivorous
arthropods, parasitic plants represent
an additional class of threat to crops
(1). As many as 4000 species, belonging
to more than 20 plant families, have been
classified as parasitic plants; hence, the switch 
to a parasitic lifestyle occurred independently and 
on several occasions during evolution (2). 

The plant genus Cuscuta (dodder) comprises 
about 200 species, all of which live as obligate 
stem holoparasites with broad host spectra (3-6). 
Germinating Cuscuta seedlings sense plant volatiles
and direct their growth toward their host
(7). Initial contact induces the formation of haus- 
toria (8), specialized structures that attach, pen- 
etrate, and connect to the vascular bundles of 
their hosts. Once connected, Cuscuta parasites 
withdraw water, nutrients, and carbohydrates 
(5, 9-11) from host plants, and also exchange 
macromolecules such as proteins and RNAs, as 
well as viruses, in a bidirectional manner (12-17). 

Most susceptible plants lack efficient defense 
systems to ward off C. reflexa. However, the cultivated
tomato (Solanum lycopersicum) is resistant
to C. reflexa and exhibits a hypersensitive
response to attempted penetration by C. reflexa 
haustoria (18-22) (Fig. 1A). We asked whether 
tomato might detect and respond to molecular 
signals associated with the parasitic plant in a
manner comparable to the response of plants to
microbe-associated molecular patterns (MAMPs). 
We tested extracts of C. reflexa for their ability to 
induce release of reactive oxygen species (ROS) 
and to trigger synthesis of the stress-related phytohormone
ethylene. Indeed, C. reflexa extract
triggered both responses in S. lycopersicum but 
not in susceptible plants, including the related 
Solanaceae Nicotiana tabacum, N. benthamiana, 
S. tuberosum, and the wild tomato species 
S. pennellii (Fig. 1B and fig. S1).

Initial characterization showed that the fac- 
tor present in the C. reflexa extract is heat-stable 
at 95°C but is sensitive to treatment with proteases
(Fig. 1C). Checking for putative secondary
modifications, we observed that enzymatic de-N- 
glycosylation had no influence on its activity, 
whereas treatment with ammonia, a procedure 
known to remove ester-type modifications such 
as sugar side chains from peptide backbones 
(22), led to a loss of functionality (Fig. 1D). The 
Cuscuta factor appeared to be constitutively pres- 
ent in all parts of C. reflexa, including shoot 
tips, stems, haustoria, and, at lower levels, in 
flowers (fig. S2A), indicating that this factor is 
not produced only at certain developmental or
infectious stages. Activity was apparently associated
with the cell walls of the parasite, from
which it could be released by acidic conditions
(fig. S2B).

Fig. 1. S. lycopersicum (cultivated tomato) shows defense responses to C. reflexa and to extracts
thereof. (A) Left: C. reflexa cannot form connections to S. lycopersicum and dies off. Right: C. reflexa on the
susceptible host S. pennellii. Photos were taken ~14 days after parasite onset. (B) C. reflexa extract triggers
ethylene biosynthesis in S. lycopersicum but not in other plant species. Bovine serum albumin (BSA) buffer in
25 mM MES buffer, pH 5.7 (0.01 mg/ml) was added as mock control; Penicillium extract (0.05 mg/ml)
served as positive control (38). FW, fresh weight. (C and D) Characteristics of the Cuscuta factor present in
the C. reflexa extract. (C) Ethylene biosynthesis of tomato leaf pieces in response to C. reflexa extract, to boiled extract
(95°C, 30 min), or to extract pretreated with the proteases indicated. (D) Ethylene response to different
doses of the Cuscuta factor after enzymatic de-N-glycosylation or to Cuscuta factor treated with 20%
NH4OH (45°C, 16 hours), respectively. (E) Ethylene response of tomato leaf pieces triggered by extracts
of other Cuscuta species or by extracts of other plants. In (B) to (E), ethylene measurements show means
of three technical replicates; error bars denote SD. All experiments were repeated more than three times.

We observed induction of ethylene production 
in S. lycopersicum with extracts from six different
Cuscuta species but not with extracts from
A. thaliana, N. benthamiana, or S. lycopersicum 
(Fig. 1E). Also inactive were extracts from Calystegia 
sepium (hedge bindweed), a nonparasitic spe- 
cies of the Convolvulaceae related to Cuscuta, 
and Rhinanthus alectorolophus, a hemiparasitic 
flowering plant of the Orobanchaceae, which in- 
fects roots of many herbaceous plants (Fig. 1E). 
Thus, the active factor seems to be common to 
Cuscuta species but absent from plants outside 
this genus. 

To purify and identify the Cuscuta factor, 
we established a purification scheme involving 
sequential separation steps (fig. S3A). Prepuri- 
fied C. reflexa extract was first separated by cation 
exchange chromatography, where activity eluted 
as several peaks indicating heterogeneity with re- 
spect to charge (fig. S3B). Active fractions of “peak 
2” (fig. S3B) were pooled and further purified by 
reversed-phase chromatography (RPC) on C18 
material using different pH conditions. This ac- 
tivity further split into different peaks and frac- 
tions (fig. S3C); this indicates that the activity, 
rather than representing a single defined mole- 
cule, is associated with a range of physicochem- 
ically heterogeneous compounds present in the 
Cuscuta extract. Although this heterogeneity dis- 
persed activity to numerous subfractions, we 
succeeded in purifying a single molecule with a 
molecular mass of 2262.79 Da that correlated 
with activity in elution from the final RPC used 
for liquid chromatography-mass spectrometry 
(MS) analysis (fig. S3D). However, we did not 
obtain conclusive fragmentation patterns from 
tandem MS/MS analysis of this molecule in several 
attempts (fig. S4). Apart from the low amount of 
this particular form of the Cuscuta factor, this 
might be attributable to the yet unidentified mod- 
ification present on the peptide. Nonetheless, 
our data suggest that the Cuscuta factor is as- 
sociated with a small, potentially modified (e.g., 
O-glycosylated) peptide that is characteristically 
present in extracts from Cuscuta spp. 

We exploited the natural variation between 
susceptible S. pennellii (23) and resistant 
S. lycopersicum to identify the receptor for the 
Cuscuta factor. We used an introgression library 
of S. lycopersicum × S. pennellii (24) to map
genomic regions essential for the differential 
response to the Cuscuta factor. The collection 
of 49 introgression lines (ILs) included chromo- 
some fragments of S. pennellii covering ~98% of 
the tomato genome (25). Only line IL8-1 was un- 
responsive to the Cuscuta factor (Fig. 2A). Fur- 
ther mapping with sublines IL8-1-1 and IL8-1-5 
(24) (Fig. 2B) identified a chromosome region 
termed bin d8-B (Fig. 2C) (25), which has 822 an- 
notated genes. Only five of these genes are pre- 
dicted to encode cell surface receptor-type proteins 
(25) that could perceive the Cuscuta factor. We 
individually expressed these candidate genes,
which encode three leucine-rich repeat receptor-
like proteins (LRR-RLPs) and two receptor-like 
kinases (RLKs), in N. benthamiana, a species 
lacking an endogenous detection system for the 
Cuscuta factor (Fig. 1B and fig. S1). Four of these
candidates had no effect, but N. benthamiana 
leaves expressing Solyc08g016270 responded 
to the Cuscuta factor with increased ethylene 
biosynthesis (Fig. 2D) and an oxidative burst 
(Fig. 3A). Dose dependence of response (Fig. 3B) 
showed half-maximal stimulation with Cuscuta 
factor at an estimated concentration of <0.3 nM. 
Thus, the protein encoded by Solyc08g016270 is 
sufficient to confer sensitive responsiveness spe- 
cific for the Cuscuta factor, and we termed it 
CuRe1 (Cuscuta receptor 1). To corroborate its
function as a genuine receptor that directly
interacts with the Cuscuta factor as a ligand, we
tested whether immunoprecipitates of CuRe1
could specifically retain Cuscuta factor when 
incubated with Cuscuta extract. As controls, we 
used similar immunoprecipitates obtained from 
N. benthamiana leaves expressing the receptor 
kinase EFR (26) and the LRR-RLP AtRLP23 (27) 
from Arabidopsis. Cuscuta factor, assayed by 
the ethylene induction assay in tissue expressing 
CuRe1, was reproducibly detected in immunoprecipitates
with CuRe1 but not with control
receptors or empty beads (Fig. 3C). 

Because activity in Cuscuta extracts sepa- 
rates into distinct subfractions during purifica- 
tion (fig. S3B), we tested these different forms 
of the Cuscuta factor for bioactivity in CuRe1-expressing
N. benthamiana plants. Samples from
all subfractions (fig. S3B) induced clear ethylene
responses in a CuRe1-dependent manner, indicating
that the Cuscuta factor, although heterogeneous
in structure, triggers CuRe1 via a
common active principle. Similarly, we confirmed
that the extracts of other Cuscuta species
(Fig. 1E) also induced ethylene biosynthesis via
CuRe1 (fig. S5B).

CuRe1 encodes a typical LRR-RLP that comprises
an N-terminal signal peptide for export via
the endoplasmic reticulum, a large LRR ectodomain
with 30 to 32 LRRs and 18 potential N-glycosylation
sites, a single transmembrane helix,
and a short cytoplasmic tail (fig. S6). Full-length
CuRe1 is represented in S. lycopersicum genomic
DNA and cDNA but is absent from S. pennellii
and from IL8-1-1 (fig. S7). This is in accordance
with available genomic sequencing data showing
that only truncated forms of CuRe1 (annotated
as Sopen08g00656 and Sopen08g006740) are
present in S. pennellii (28). The closest relatives
of CuRe1 in S. lycopersicum, with amino acid
sequence identities of 82% and 72%, respectively
(Solyc08g016210 and Solyc08g016310), were
found to be in close proximity to CuRe1 on chromosome 8
but were unable to initiate ethylene
production in response to the Cuscuta factor
when transiently expressed in N. benthamiana
(Fig. 2D). CuRe1-like sequences with similar amino
acid sequence identities of ~70 to 80% can also
be found in other Solanaceae but not in species
outside this family. However, the CuRe1-related
genes present in N. benthamiana and S. tuberosum
seem not to be sufficient to confer responsiveness
to the Cuscuta factor in these species (Fig.
1B and fig. S1).

Fig. 2. Mapping of responsiveness to C. reflexa extracts in tomato. (A) Tomato introgression lines
of S. lycopersicum × S. pennellii were screened for ethylene production in response to C. reflexa extract; BSA/buffer
was used as mock control. (B) Ethylene response of additional ILs related to tomato chromosome 8. (C) Graphical
scheme of the mapped chromosome region bin d8-B as modified from (25). (D) Receptor candidate genes
encoded within bin d8-B were expressed in N. benthamiana. Ethylene production was measured after treatment
with C. reflexa extract, Penicillium extract, or BSA as indicated. Data are means ± SD of n = 3 replicates.

Plant receptor-like proteins lack cytoplasmic 
kinase domains for signaling output and, in 
general, seem to depend on adaptor kinases of 
the SOBIR1 (suppressor of BAK1-interacting 
receptor kinase) type (27, 29-33). Coimmuno- 
precipitation analysis with tagged versions of 
CuRe1 and SlSOBIR1 or SlSOBIR1-like from 
S. lycopersicum (tomato) showed constitutive 
interaction of CuRe1 with both of these adap- 
tor kinases (Fig. 3D). As for other RLPs, such as 
tomato Cf-9 and Cf-4 or A. thaliana AtRLP23, 
AtRLP30, and ReMax/AtRLP1 (27, 29, 30, 33, 34), 
formation of the complex between CuRe1 and 
SlSOBIR1 occurred irrespective of the presence 
or absence of the Cuscuta factor as stimulus 
(Fig. 3D). 


To check for the biological function of CuRe1, 
we stably transformed CuRe1 constructs into 
S. pennellii and N. benthamiana, which are 
usually insensitive to the Cuscuta factor and 
susceptible to C. reflexa attack (Fig. 1B and fig. 
S1) (23, 35). Transformed lines of S. pennellii 
and N. benthamiana plants gained responsive- 
ness to Cuscuta factor (figs. S8 and S9) and 
exhibited increased resistance to C. reflexa in- 
festation (Fig. 3, E and F). In these lines, the process 
of parasite ingrowth seems disturbed, as hyper- 
sensitive response symptoms were visible at 
haustoria penetration sites on the host (fig. S10). 
Thus, CuRe1 from tomato improves resistance 
to C. reflexa attack in both the closely related 
species S. pennellii and the more distant spe- 
cies N. benthamiana. 

Full resistance of tomato against C. reflexa 
seems to require more than CuRe1 and perception 

CuRe1; pulldown against the C-terminal green fluorescent protein tag present 
at CuRe1. Proteins were coexpressed in N. benthamiana, and samples were 
treated with Cuscuta factor (+; 1:100 diluted in water) or water alone (-) as 
control. (E) Growth of C. reflexa shoots on S. pennellii plants transformed with 
CuRe1 (T1 generation) or nontransformed wild-type (wt) controls during 14 days 
of infestation with one C. reflexa shoot [15 cm in length, ~0.6 g FW] per host 
plant. Red diamonds represent weight of individual C. reflexa shoots. Box 
plots show median values of n = 12 replicates. **P(adj) = 0.0015 (Tukey honestly 
significant difference test). (F) Growth of C. reflexa shoots on N. benthamiana 
plants stably transformed with CuRe1 (homozygous T2 generation) or non- 
transformed wild-type controls during 21 days of C. reflexa infestation. Ex- 
perimental conditions and data evaluation were as in (E). Triangles mark 
outliers not included in analysis. **P < 0.005 (Student t test). Data pres- 
ented in (E) and (F) are representative of three independent repetitions, each 

PLANT ECOLOGY 


Rapid evolution accelerates plant 
population spread in fragmented 
experimental landscapes 


Jennifer L. Williams, Bruce E. Kendall, Jonathan M. Levine 


Predicting the speed of biological invasions and native species migrations requires an 
understanding of the ecological and evolutionary dynamics of spreading populations. 
Theory predicts that evolution can accelerate species’ spread velocity, but how landscape 
patchiness—an important control over traits under selection—influences this process is 
unknown. We manipulated the response to selection in populations of a model plant 
species spreading through replicated experimental landscapes of varying patchiness. 
After six generations of change, evolving populations spread 11% farther than 
nonevolving populations in continuously favorable landscapes and 200% farther in 
the most fragmented landscapes. The greater effect of evolution on spread in patchier 
landscapes was consistent with the evolution of dispersal and competitive ability. 
Accounting for evolutionary change may be critical when predicting the velocity 
of range expansions. 


In an era of global environmental change, 
biological invasions and the movement of 
species ranges with climate change present 
two of the greatest disruptions to natural 
and managed ecosystems (1, 2). At the core of 
each dynamic is the spread of populations across 
landscapes fragmented by natural and anthro- 
pogenic barriers to movement. It has long been 
appreciated that habitat fragmentation slows 
the velocity of spread (3, 4), but its influence on 
the potential for evolution to increase popula- 
tion expansion is unknown (5). Theory shows 
that natural selection at the low-density front 
of populations expanding through continuously 
favorable landscapes, coupled with the spatial 
sorting of offspring, favors traits contributing to 
fecundity and dispersal, both of which acceler- 
ate the invasion velocity (6-10). Whether this eco- 
evolutionary process operates similarly in systems 
fragmented by unsuitable habitat is uncertain be- 
cause spread in these systems depends on the 
buildup of high-density populations capable of 
dispersing over gaps (5, 11). Although any factor 
that alters selection on an expanding population 
can influence spread, whether evolution operating 
through selection or genetic drift predictably af- 
fects spread velocity on the rapid time scale of 
ecological dynamics remains to be determined. 
Answering questions about how evolution affects 
population expansion has important implications 
for predicting the future spread of biological in- 
vasions and climate change migrants, based on 
currently measured rates. 

Empirical progress toward understanding evo- 
lution in populations spreading through frag- 
mented landscapes is limited, largely because the 
process occurs over many generations and at geo- 
graphic spatial scales. Due to these constraints, 
nearly all empirical evidence for evolution affect- 
ing spread comes from a few retrospective, ob- 
servational analyses (12-16). The spread velocity 
of cane toads, for example, increased by a factor 
of 5 after the species was introduced to Australia, 
consistent with evolved changes in dispersal 
(14, 17, 18). Nonetheless, with stochastic events 
contributing to the ecological and evolutionary 
trajectories of spreading populations (5, 19-21), 
replicated, controlled studies are necessary for 
understanding the predictability of this eco- 
evolutionary dynamic (15). Given the challenges 
of replicating invasions in the field and doing so 
in landscapes of varying fragmentation, model 
laboratory systems present an excellent opportu- 
nity to evaluate how evolution affects the speed 
at which populations expand through habitats 
of varying patchiness. 

We manipulated evolution in populations of 
the model plant Arabidopsis thaliana spreading 
through continuous and fragmented landscapes, 
each consisting of a linear array of rectangular 
pots (Fig. 1A) (22). We initiated each replicate in- 
vasion in the leftmost pot of the array by sowing 
equal fractions of 14 genotypes (recombinant 
inbred lines), which varied in spread-relevant 
traits. Due to nearly complete self-pollination 
of A. thaliana (23), the 14 genotypes can be treated 
as clones (24), facilitating our measurements of 
evolutionary change. In evolving populations, the 
resulting plants produced seeds, which dispersed 
across the array (assisted via a simulated rain 
event), constituting the next generation of the 
population (Fig. 1B). In nonevolving treatments, 
germinants emerging in the next generation 
were replaced with individuals randomly drawn 


from the initial seed pool, thus maintaining 
population dynamics while eliminating any 
change in the frequency or spatial sorting of 
genotypes. We manipulated habitat patchi- 
ness by separating individual pots of suitable 
habitat by gaps that were 0 (continuous land- 
scapes), 4, 8, or 12 times the mean dispersal 
distance. This protocol was repeated over six 
generations of spread, at which point individu- 
als at the leading edge and back of the invasions 
were genotyped, and traits of all 14 genotypes 
were measured. 
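
The nonevolving treatment described above can be illustrated with a minimal Python sketch. It assumes that each pot is represented as a list of genotype labels, a hypothetical data layout chosen for illustration and not the authors' protocol. Every germinant is replaced by a genotype drawn at random from the 14-genotype founding pool, so per-pot abundance is preserved while genotype frequencies and spatial sorting are reset.

```python
import random


def reset_to_founding_pool(pots, founding_genotypes, rng):
    """Replace each germinant with a random draw from the founding pool.

    pots: list of lists; pots[i] holds the genotype label of every
          germinant found in pot i (hypothetical representation).
    The number of individuals in each pot is kept, so the demographic
    pattern of spread is unchanged; only genotype identity is reset.
    """
    return [[rng.choice(founding_genotypes) for _ in pot] for pot in pots]


rng = random.Random(1)            # fixed seed so the sketch is repeatable
founding = list(range(1, 15))     # 14 genotypes, labelled 1..14 (assumed labels)
evolved = [[3, 3, 7, 3], [], [7, 7], [7]]   # made-up germinants per pot

nonevolving = reset_to_founding_pool(evolved, founding, rng)
print([len(pot) for pot in nonevolving])    # [4, 0, 2, 1]
```

In this sketch the printed per-pot counts match the input, which is the property the nonevolving treatment relies on: population dynamics are maintained while any change in genotype frequency is erased.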

We found that after six generations of spread 
in continuous landscapes, evolving populations 
spread a modest 11% farther than nonevolving 
populations (Fig. 2A), a difference that was only 
marginally significant (43, = -2.05, P = 0.060). 
By contrast, in experimental landscapes with gaps 
12 times the mean dispersal distance, evolving 
populations spread three times as far as their 
nonevolving counterparts (Fig. 2D) (t10.4 = -3.36, 
P = 0.007), leading to a significant gap size by 
evolution interaction (F,,7. = 10.77, P = 0.002). 
The effects of evolutionary change were so strong 
in patchy landscapes that evolving populations 
showed no significant reduction in velocity as the 
size of gaps increased from 4 to 8 to 12 times the 
mean dispersal distance (generation-six location 
of dark green line in Fig. 2, B to D) (F1,25 = 0.014, 
P = 0.908), even as velocity slowed in the non- 
evolving populations (Fi... = 8.52, P = 0.007). 
Patchiness and evolutionary change also influ- 
enced the among-replicate variability in expansion 

Fig. 1. Spread of A. thaliana in experimental 
greenhouse arrays. (A) Leading edge of an in- 
vasion of a continuous landscape. (B) Spread in a 
continuous landscape for one replicate in the evolving 
treatment. Each colored line represents a succes- 
sive generation (pink, founding population; red to 
purple from left to right, first to sixth generation of 
spread). Points show abundance in the individual 
pots that make up the arrays. 

velocity (Fig. 2). The coefficient of variation for 
spread was four times greater in the patchiest 
landscapes than in the continuous ones (fig. S1), 
consistent with a spread process driven by in- 
frequent long-distance dispersal events in frag- 
mented systems. We also found that evolving 
populations showed significantly less among- 
replicate variation in spread than nonevolving 
populations (fig. S1). Thus, despite the theoretical 
expectation for greater genetic drift at the leading 
edge of spreading populations (25), invasion speed 
was more predictable in evolving populations. 
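
As a worked illustration of the among-replicate comparison above, the following Python sketch computes the coefficient of variation (sample standard deviation divided by the mean) of spread distance across replicates. The distances are invented for illustration and are not the study's data; the use of the sample rather than population standard deviation is also an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(distances):
    """Return the sample standard deviation divided by the mean."""
    m = mean(distances)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return stdev(distances) / m


# Hypothetical farthest distances reached by replicate invasions
continuous = [10.0, 11.0, 9.0, 10.0]
patchy = [2.0, 9.0, 1.0, 6.0]

ratio = coefficient_of_variation(patchy) / coefficient_of_variation(continuous)
print(f"CV is {ratio:.1f} times greater in the patchy example")
```

Because the coefficient of variation is scaled by the mean, it allows variability to be compared between landscapes in which the average distance spread differs.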

One explanation for the greater effects of evo- 
lutionary change on spread velocity in patchier 
landscapes might be faster evolution due to 
stronger selection in these systems. However, 
the extent of genotypic change did not differ 
significantly with gap size (fig. S2 and table S1; 
Fig. 3 shows the initial and final genotypic com- 
positions), and the extent of trait change in- 
creased only marginally with increasing gap size 
(Fig. 3, fig. S2, and table S1). In fact, trait and 
genotypic change occurred in populations spread- 
ing through all landscape types, irrespective of 
whether evolution enhanced the spread veloc- 
ity (significant intercepts in the fitted models of 
table S1). These evolutionary changes reflect the 
combined effects of selection and drift. In the 
continuously favorable landscapes in particular, 
we found more among-replicate variation in the 
genotypic composition of leading individuals than 
expected by chance (fig. S3), consistent with spatial 
priority effects where genotypes that initially 
got ahead due to chance dispersal were able 
to stay ahead (5, 25). 

Despite similarities in the extent of trait and 
genotypic change across gap sizes, landscape 
patchiness affected the direction of evolution. 
Height and the average distance of the farthest 
dispersed seed, traits correlated with one another 
(Spearman rank correlation coefficient rs = 0.55, 
P = 0.046), increased with landscape patchiness 
(backward and rightward shift of the replicates 
with increasing patchiness in Fig. 3; P = 0.008 and 
0.060, respectively, Table 1). These trait changes 
were associated with changes in the genotypic 
composition of the leading individuals with in- 
creasing patchiness (Fig. 3) (F\34 = 2.54, P = 0.042). 
Considering theory showing that greater dispersal 
increases the invasion velocity (6-10), the evolution 
of greater height and dispersal in patchier systems 
is consistent with the greater effects of evolution 
on spread in these landscapes. Nevertheless, 
whether landscape patchiness selected directly 
for better dispersal or indirectly via unmeasured 
traits that are correlated with dispersal remains 
an open question. 
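
The trait changes discussed here are reported as changes in genotype-weighted trait rank relative to the founding population, whose 14 equally frequent genotypes have a mean rank of 7.5 (Table 1). A minimal Python sketch of that calculation follows; the trait ranks and the genotypes of the leading individuals are hypothetical, and the function name is ours.

```python
from collections import Counter

N_GENOTYPES = 14
FOUNDING_MEAN_RANK = (1 + N_GENOTYPES) / 2   # 7.5 for ranks 1..14


def trait_rank_change(leading_genotypes, trait_rank):
    """Genotype-weighted mean trait rank of the leading individuals,
    minus the mean rank of the founding population."""
    counts = Counter(leading_genotypes)
    total = sum(counts.values())
    weighted = sum(trait_rank[g] * n for g, n in counts.items()) / total
    return weighted - FOUNDING_MEAN_RANK


# Hypothetical ranks: genotype g has trait rank g (1 = lowest, 14 = highest)
rank = {g: g for g in range(1, N_GENOTYPES + 1)}

# Founding composition (one of each genotype) shows no change
print(trait_rank_change(list(range(1, 15)), rank))              # 0.0

# Ten leading individuals dominated by two high-ranked genotypes
print(round(trait_rank_change([14] * 6 + [10] * 4, rank), 2))   # 4.9
```

A positive value indicates that the invasion front is enriched for genotypes ranking high in the trait; zero indicates no change from the founding composition.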

Increased competitive ability probably also 
contributed to the greater effects of evolutionary 
change on spread velocity in patchier systems. 
Although competitive ability evolved to the same 
extent regardless of gap size [upward shift of 
replicates (Fig. 3 and Table 1); a similar result 
was found for seed mass (Table 1)], theory (5, 11) 
predicts that increasing competitive ability will 
have a greater effect on spread in fragmented 
versus continuously favorable landscapes (fig. S4 
shows this result applied to our system). In frag- 
mented habitats, individuals often compete at 
crowded invasion fronts, enabling genotypes that 
make more offspring at high density (i.e., better 
competitors) to spread faster (5, 11) (fig. S4). 
Though weaker, this effect also emerges in models 
of finite populations in continuously favorable 
landscapes (fig. S4) (26), consistent with the 
evolving populations moving modestly farther 
than the nonevolving populations in continu- 
ous landscapes (Fig. 2A). 

Extrapolating our results to wild populations 
requires care for several reasons. First, the focal 
populations were effectively asexual, meaning 
that trait variation was not continuous and traits 
were perfectly linked. Nonetheless, it is not clear 
how more continuous variation or less linkage 
between traits would influence the effect of evo- 
lutionary change on spread velocity. Second, al- 
though we manipulated genetic change in this 
experiment, we cannot rule out the influence of 
maternal and epigenetic effects on our results. 
Third, we explored the effects of fragmenta- 
tion, assuming it has no influence on the initial 
pool of genetic variation. If fragmentation in the 
nonspreading portion of a species range was to 
select for reduced dispersal (16, 27), then popula- 
tions that spread from such sources might have 
less genetic variation in dispersal-related traits, 
limiting the response to selection. Related to this 
point, the effects of evolution in our study arose 
through drift and selection on standing variation; 
our results do not bear on the rates of evolution 
resulting from the rise of novel mutations. 

Our results demonstrate that evolution on 
ecological time scales can increase the speed 
of advance in spreading populations, and markedly 
so in the most patchy landscapes. However, fur- 
ther studies are needed to evaluate whether 
patchiness per se generally selects for traits that 
increase spread (24). Our results for less patchy 
landscapes show that large evolutionary changes 
in spreading populations can have little or no 
consequence for spread velocity. More generally, 
our findings add a more process-focused per- 
spective to past work that has shown either 
accelerating invasion fronts consistent with evo- 
lution (13-15, 17) or trait differences between 
individuals at the front and back of spreading 
populations (18, 28, 29). We conclude that ac- 
counting for evolutionary change on ecological 
time scales may be critical when predicting the 
rate at which biological invasions and climate 
change migrants reach new locations. 

Fig. 2. Farthest distance colonized in each generation. Distance moved by evolving (thin green 
solid lines) and nonevolving (thin gray dashed lines) replicate invasions and their mean values (thick 
green and black lines, respectively) in landscapes that are (A) continuous or separated by gaps that 
are (B) 4, (C) 8, and (D) 12 times the mean dispersal distance. Lines in the three patchy landscapes 
are jittered for visibility. 

Fig. 3. Genotypes and traits at the invasion fronts. The 
central pinwheel of each panel depicts the equal frequency 
of genotypes in the founding population and is located at 
the mean trait rank for three spread-relevant traits: 
competitive ability (dominance in nonspreading con- 
text), dispersal (average distance of farthest dispersed 
seed from a solitary individual), and plant height. Pies 
show the genotypic composition of the 10 leading 
individuals for each replicate invasion after six gen- 
erations of spread through landscapes that are (A) con- 
tinuous or separated by gaps that are (B) 4, (C) 8, and 
(D) 12 times the mean dispersal distance. The location of 
each replicate is given by the genotype-weighted trait 
rank mean (22). A fourth trait, seed mass, also evolved, 
but its evolution did not vary with landscape patchiness 
and is not shown here. The central panel shows trait 
ranks of the 14 genotypes; numbers indicate geno- 
type identity. 

Table 1. Evolution of spread-relevant traits as a function of landscape patchiness. Results of 
linear models examining the change in height, dispersal, competitive ability, and seed mass at the 
invasion front after six generations of evolution as a function of landscape patchiness (size of gaps 
between suitable habitat). Trait change was measured as the difference between the genotype-weighted 
trait rank for each replicate (N = 36) and 7.5, the mean trait rank of 14 genotypes in the founding population. 
Significant slopes indicate that the amount of change in the trait increased with increasing gap size (units of 
mean dispersal distance). Significant intercepts indicate that the trait changed significantly from the founding 
population, even in continuous landscapes. For competitive ability and seed mass, two traits with non- 
significant slopes, zero-slope models yielded highly significant intercepts (P < 0.001). Est., estimated value. 


Change in genotype-weighted trait rank of    Intercept Est.    Intercept t 


Competitive ability 3.36 3.74 

Seed mass 1.29 Ds) 

PLANT DEVELOPMENT 


Arabidopsis transcriptional repressor 
VAL1 triggers Polycomb silencing at 
FLC during vernalization 


Julia I. Qüesta, Jie Song, Nuno Geraldo, Hailong An, Caroline Dean 


The determinants that specify the genomic targets of Polycomb silencing complexes are 
still unclear. Polycomb silencing of Arabidopsis FLOWERING LOCUS C (FLC) accelerates 
flowering and involves a cold-dependent epigenetic switch. Here we identify a single point 
mutation at an intragenic nucleation site within FLC that prevents this epigenetic switch 
from taking place. The mutation blocks nucleation of plant homeodomain—Polycomb 
repressive complex 2 (PHD-PRC2) and indicates a role for the transcriptional repressor 
VAL1 in the silencing mechanism. VAL1 localizes to the nucleation region in vivo, promoting 
histone deacetylation and FLC transcriptional silencing, and interacts with components of 
the conserved apoptosis- and splicing-associated protein (ASAP) complex. Sequence- 
specific targeting of transcriptional repressors thus recruits the machinery for PHD-PRC2 
nucleation and epigenetic silencing. 


In Arabidopsis thaliana, prolonged cold ex- 
posure during winter promotes flowering 
through epigenetic silencing of FLOWERING 
LOCUS C (FLC) in a process called vernaliza- 
tion (1, 2). Cold exposure induces expression of 
antisense transcripts to FLC (collectively known as 
COOLAIR) (3) and a plant homeodomain (PHD) 
protein called VERNALIZATION INSENSITIVE 
3 (VIN3) (4). COOLAIR facilitates FLC transcrip- 
tional silencing and coordinates the switching be- 
tween chromatin states (5). VIN3 associates with a 
homologous PHD protein, VERNALIZATION 5 
(VRN5), and a vernalization-specific Polycomb 
repressive complex 2 (PRC2) (6, 7), which accu- 
mulates at an intragenic nucleation region cover- 
ing the first exon and part of the first intron of FLC. 
Quantitative accumulation of H3K27me3 at the 
nucleation region during cold exposure, and over 
the whole locus after cold exposure, reflects a cell- 
autonomous epigenetic switch affecting an in- 
creasing proportion of cells (8). The recruitment 
of PHD-PRC2 to the nucleation region is, therefore, 
a key step in the silencing process. In Drosophila, 
Polycomb response elements (PRE) have been 
identified as cis sites for PRC2 recruitment and 
provide sequence-specific “memory” modules for 
the activity of linked enhancers (9). In contrast, 
in mammals, CpG islands facilitate targeting of 
Polycomb machinery, with Polycomb complexes 
“sampling” chromatin to determine transcriptional 
states (10). In Arabidopsis, PRE-like elements have 
been identified (11), but whether they recruit 
Polycomb complexes has been unclear. We sought 
to determine what targets PHD-PRC2 to the nu- 
cleation region of FLC. 

A forward genetic screen for impaired FLC- 
LUCIFERASE (FLC-LUC) silencing (12) identified 
the vrn8 mutant (Fig. 1A and fig. S1, A to C). The pro- 
genitor plants showed characteristic cold-induced 
silencing of both the endogenous FLC and the 
FLC-LUC transgene (Fig. 1, B and C, and fig. S1A). 
In contrast, the FLC-LUC transgene expression re- 
mained high in vrn8 after cold exposure, whereas 
expression of the endogenous FLC was reduced as 
normal (Fig. 1, B and C, and fig. S1A). The different 
behavior of the two copies suggests that vrn8 does 
not encode a trans factor involved in vernalization. 

The vrn8 mutation was a cytosine-to-thymine 
change in intron 1 of the FLC-LUC transgene, at 
position +585 downstream of the transcriptional 
start site (hereafter, we term this mutation C585T; 
Fig. 1D and fig. S1D). C585T maps to the first of a 
pair of RY cis elements (TGCATG, RY-1 and RY-2; 
R, purine; Y, pyrimidine), which are recognized 
by B3 DNA binding domains (13). Alignment of 
FLC intronic sequences from different species 
of Arabidopsis and Brassica shows 100% se- 
quence conservation of both RY motifs (Fig. 1D). 
To confirm the effect, we regenerated the C585T 
mutation and compared plants carrying wild-type 
(FLC-WT) and mutated (FLC-C585T) transgenes 
(fig. S2A). Cold-induced repression of FLC was 
impaired in FLC-C585T transgenic lines (Fig. 1E 
and fig. S2B), and the plants flowered later (fig. 
S2C). The proximity of the C585T change to the 
PHD-PRC2 nucleation region prompted an analysis 
of cold-induced chromatin changes in FLC-C585T. 
The quantitative increase in H3K27me3 and equiv- 
alent decrease in H3K36me3 at FLC-WT (Fig. 1F 
and fig. S2D) (14) were not found at FLC-C585T 
(Fig. 1G and fig. S2E), suggesting that the C585T 
mutation prevents PHD-PRC2 nucleation. 
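
To make the effect of a single-base change on an RY element concrete, the Python sketch below scans a sequence for the TGCATG motif before and after a cytosine-to-thymine substitution. The sequence is an invented toy string, not the FLC intron, and the sketch assumes that the substituted base is the single cytosine of the motif on the strand shown; the helper names are ours.

```python
RY_MOTIF = "TGCATG"   # RY cis element as given in the text


def find_motif(seq, motif=RY_MOTIF):
    """Return 0-based start positions of every occurrence of motif in seq."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)   # allow overlapping matches
    return hits


def substitute(seq, index, ref, alt):
    """Return seq with the base at index changed from ref to alt."""
    if seq[index] != ref:
        raise ValueError(f"expected {ref} at index {index}, found {seq[index]}")
    return seq[:index] + alt + seq[index + 1:]


# Toy sequence carrying two tandem RY elements (hypothetical, for illustration)
toy = "AATTGCATGCCTTGCATGAA"
print(find_motif(toy))                 # [3, 12]

# C-to-T change at the cytosine of the first element (index 5 in the toy string)
mutant = substitute(toy, 5, "C", "T")
print(find_motif(mutant))              # [12]
```

In the toy example the substitution removes the first motif match and leaves the second intact, mirroring how a point mutation can abolish one of a pair of sequence-specific binding sites.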
Identification of the C585T mutation raised 
the question of what bound to the RY elements. 
Potential candidates included the B3 transcrip- 
tional regulators belonging to the LAV family (13): 
LEAFY COTYLEDON 2 (LEC2), ABSCISIC ACID 
INSENSITIVE 3 (ABI3), FUSCA3 (FUS3), and the 
VIVIPAROUS1/ABI3-LIKE factors (VAL1, VAL2, and 
VAL3). The low levels of expression of ABI3, 
LEC2, FUS3, and VAL3 in 10-day-old seedlings 
(fig. S3, A and B) argued against a function of the 
corresponding transcriptional regulators in FLC 
silencing during vernalization. In contrast, VAL1 
and VAL2 were expressed at higher levels than 
other LAV family genes in seedlings (fig. S3A) and 
continued to be expressed during vernalization 
(fig. S3, C and D). VAL proteins repress late seed 
maturation genes and promote the switch from 
embryonic to vegetative development. vall val2 
double-mutant seedlings express many embryonic- 
specific transcripts and also show synergistically 
increased expression of FLC compared with each 
single mutant alone (15). We crossed vali and 
val2 single mutants with the Columbia FRIGIDA 
(Col FRI) line to assess whether VAL genes are 
required for FLC regulation during vernalization. 
vall FRI mutants flowered later than Col FRI 
and val2 FRI plants (Fig. 2A and fig. S4A), and 
this was reflected in higher FLC expression levels 
before and during cold exposure (Fig. 2B). val 
FRI mutants also showed reduced sensitivity to 
vernalization. The cold-induced reduction in non- 
spliced FLC transcript (probably reflecting tran- 
scription; figs. S4B and S5A) and FLC mRNA (Fig. 
2C) was slower in vall FRI than in wild-type 
plants, but COOLAIR induction was unaffected 
(fig. S5B). The mutant phenotype was comple- 
mented by expression of a hemagglutinin (HA)- 
tagged VALI (fig. S6). VAL1 not only modulated 
FLC transcriptional shutdown but also appeared 
to influence the FLC homologs MADS AFFECTING 
FLOWERING 1 and 2 (MAFI and MAF2; fig. S4, 
C and D). Loss of VALI also attenuated the 
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Fig. 1. vrn8 disrupts an intronic RY element that 
is required for FLC silencing. (A) The vrn8 mutant 
fails to silence FLC-LUC after cold exposure. Lucifer- 
ase activity is depicted with false color from least 
(blue) to most intense (red). Ler, Landsberg erecta ac- 
cession. (B and C) Expression of endogenous Ler FLC 
(B) and FLC-LUC transgene (C). The data are abun- 
dances relative to UBIQUITIN-CONJUGATING ENZYME 21 
(UBC) and standardized to nonvernalized (NV) con- 
ditions. Numeral-W-numeral, number of weeks (W) 
of cold treatment followed by number of days of 
growth at 22°C. (D) Schematic representation of FLC 
genomic locus. Black boxes represent exons. The green 
dashed line represents the region analyzed in (F) and 
(G). Alignment of FLC intronic sequences from differ- 
ent species of Brassicaceae is shown below; RY motifs 
are in bold. The single nucleotide change in vrn8 is 
indicated (C585T, red). TSS, transcriptional start site. 
(E) Spliced FLC expression in FLC-WT and FLC-C585T 
transgenic lines. (F and G) ChIP analysis of H3K27me3 
accumulation at the FLC locus in FLC-WT (F) and FLC- 
C585T (G) plants. Numbers on the x axes are distances 
to the TSS (TSS = 0). Throughout this figure, values are 
means ± SEM of three biological replicates. **P < 0.01; 
*P < 0.05; ns, not significant. 


Fig. 2. VAL1 is a component of the vernalization 
mechanism. (A) Flowering time after 6 weeks of cold. 
Each gray triangle represents a single plant (n = 36); 
means (black horizontal lines) ± SD (error bars) are 
shown. ***P < 0.001. (B) Spliced FLC expression in 
val1 FRI, Col FRI, and val2 FRI plants before (NV) and 
during (6W0) cold exposure. (C) Dynamics of spliced 
FLC down-regulation during vernalization. Data in (B) 
and (C) are abundances, expressed as in Fig. 1. (D and E) 
H3K27me3 accumulation (from ChIP analysis) along 
the FLC locus in Col FRI (D) and val1 FRI (E) plants. 
Numbers on the x axes are distances to the TSS (TSS = 
0). The schematic of the FLC locus is shown below 
each panel. Values in (B) to (E) are means ± SEM of 
three biological replicates. 


cold-induced H3K27me3 accumulation at FLC: Start- 
ing levels before cold exposure were lower and failed 
to reach wild-type levels after 4 weeks of cold (Fig. 2, 
D and E). The nonreactivation of FLC after cold ex- 
posure in val1 FRI (Fig. 2C and fig. S4B) implies that 
nucleation is defective (Fig. 2E) but that the com- 
ponents of the longer-term Polycomb memory are 
not perturbed. The phenotype of val1 FRI (Fig. 2E) 
was not as strong as that caused by C585T (Fig. 1G), 
which is consistent with VAL1 and VAL2 func- 
tioning redundantly in FLC regulation (15). 
ABI3, FUS3, and LEC2 bind in a sequence- 
specific manner to RY elements in vitro (16-18). 
This is also the case for the VAL1 B3 domain, which 
binds in vitro to the FLC RY-1 element (Fig. 3A), 
with the C585T mutation sufficient to disrupt 
binding. Competition experiments showed spec- 
ificity of VAL1 B3 binding to the FLC RY motif 
(fig. S7, A and B) and not to a different B3 bind- 
ing cis element (RAV) (19). VAL1 B3 can also bind 
to the second RY site (RY-2) that occurs just 

Fig. 3. VAL1 binds to the FLC nucleation region. 
(A) Electrophoretic mobility shift assay testing GST- 
VAL1B3 binding to an RY cis element (GST, glutathione 
S-transferase). Wild-type (RY-1) or mutated [RY-1(C585T)] 
RY probes were combined with increasing amounts 
of GST-VAL1B3 (1 = 25 ng/µl). (B) ChIP-quantitative 
polymerase chain reaction of VAL1-HA binding along 
FLC. ACTIN2 (ACT), UBC, and SHOOT MERISTEMLESS 
(STM) were used as negative controls. The schematic 
of the FLC locus is shown below. Values are means ± 
SEM of one biological replicate for samples from 
six independent transgenic lines. 


downstream of RY-1; competition experiments 
revealed that mutation of both RY elements is 
required to block VAL1 B3 binding to the nu- 
cleation region in vitro (fig. S7C). Consistent 
with this, in vivo mutation of RY-2 also atten- 
uated FLC silencing (fig. S7D), suggesting that 
both RY elements contribute functionally. Chro- 
matin immunoprecipitation (ChIP) confirmed in 
vivo binding of VAL1-HA to the FLC nucleation 
region during cold exposure (Fig. 3B). These data 
raise interesting parallels with the cooperative 
binding of auxin response factors (ARFs) to tan- 
dem binding sites in vivo (20). 

The association of VALI with the nucleation 
region as a prerequisite for PHD-PRC2 activity 
at the locus raised two questions. First, how does 
VALI binding within intron 1 repress FLC tran- 
scription? Second, what is the link between VAL1 
and PHD-PRC2? Affinity purification of HA- and 
green fluorescent protein (GFP)-tagged VALI from 
Arabidopsis seedlings revealed VAL1 in vivo inter- 
actors: the PRC1 RING finger homolog AtBMIIA, 
the co-repressor SIN3-associated protein SAP18 
(AtSAP18), and its two partners, the RNA-binding 
protein SR45 and the SAP-domain protein ACINUS 
(Fig. 4A, fig. S6D, and tables S1 and S2) (27). 
AtBMIIA has previously been found to interact 
with VALI to repress seed maturation genes 
through H2A lysine 121 ubiquitination (H2Aub) 
(22). We therefore tested accumulation of H2Aub 
at FLC during vernalization. Although H2Aub 
Fig. 4. VAL1 nucleates silencing at the FLC locus. (A) List of proteins identified by VAL1-HA (IP1 and
IP2) and GFP-VAL1 (IP3 and IP4) affinity purification (IP, immunoprecipitation). (B and C) ChIP analysis of
histone H3 acetylation (acetylH3) along the FLC locus in Col FRI (B) and val1 FRI (C). Numbers on the x axes
are distances to the TSS (TSS = 0) and correspond to the schematic below each panel. (D) Days to flower
after 12 weeks of vernalization for F2 plants from val1 FRI crosses with vin3 FRI and vrn2 FRI (table S6). WT
indicates at least one functional allele for VAL1, VRN2, and VIN3. Lowercase indicates that both alleles are
nonfunctional (val1, vin3, and vrn2). All F2 individuals are in FRI background. DNF, did not flower. (E) Schematic
depicting sequence-specific binding of VAL proteins to the FLC nucleation region, targeting ASAP, HDA19,
and potentially PRC1 activities to shut down transcription and thereby enable PHD-PRC2 nucleation. After
this process, H3K27me3 covers the FLC locus to help maintain epigenetic silencing. Small double arrows
indicate protein interactions identified in this work.



was detected at FLC after 4 weeks of cold, there
was no difference between Col FRI and val1 FRI
(fig. S8). However, H2Aub is not always required
for efficient PRC1 repression (23), which can directly
repress genes by triggering chromatin compaction
(24).

SAP18 is a component of the SIN3-histone
deacetylase complex (HDAC) that in humans is
required to enhance SIN3-mediated repression of
transcription (25). VAL1 contains a plant-specific
ethylene-responsive element binding factor-associated
amphiphilic repression (EAR) motif that can physically
interact with AtSAP18 to mediate histone
deacetylase 19 (HDA19) recruitment (26). We hypothesized
that VAL1 could bring HDA19 activity
to the FLC locus. H3 acetylation is reduced at the
FLC nucleation region during cold exposure (Fig.
4B) (27); this reduction was blocked in val1 FRI
(Fig. 4C), as has been observed in vin3 (27). SAP18
has also been linked to RNA processing and degradation
(28) and is a subunit of the conserved
apoptosis- and splicing-associated protein (ASAP)
complex, together with SR45 and ACINUS (27). Affinity
purification of GFP-AtSAP18 from Arabidopsis
confirmed association with AtSR45 and ACINUS
(table S3). This suggests that the conserved function
of ASAP and HDA19 is required for VAL1-mediated
FLC silencing. Accordingly, hda19 and
sr45 mutants, and to a lesser extent sap18 mutants,
showed FLC up-regulation (fig. S9). Importantly,
AtSAP18, AtSR45, and HDA19 were immunopurified
with both VRN5-GFP (table S4) and VIN3-GFP
(table S5), linking VAL1 association with a specific
DNA sequence to Polycomb silencing. Combination
of val1 FRI with mutants with loss of function
in VRN2 (vrn2 FRI) and VIN3 (vin3 FRI)
confirmed the close functionality of VAL1 and
PHD-PRC2. val1 vin3 and val1 vrn2 plants were
greatly delayed in flowering (Fig. 4D; table S6; and
fig. S10, A to D) after 12 and 18 weeks of cold, and
FLC expression remained high (fig. S10, E and F).
This synergistic interaction is best explained by
VAL1-VAL2 redundancy, with both proteins functioning
through PHD-PRC2.

We propose that VAL1 binds sequence-specifically,
probably as a homodimer, but potentially as a
heterodimer with VAL2, to the RY motifs in the
FLC nucleation region (Fig. 4E). This recruits the
ASAP complex and potentially PRC1, resulting in
the shutdown of transcription and reduced histone
acetylation. In turn, these activities allow
PHD-PRC2 nucleation and long-term epigenetic
silencing of the locus. We cannot exclude the
possibility of effects of VAL1 and VAL2 on FLC
expression that are independent of the binding
to FLC intron 1. The association of HDA19 with
the PHD proteins suggests that it has multiple
roles in the process, potentially interacting with
VAL1 through SAP18 (26) for transcriptional
repression and with VIN3 (and VRN5) to facilitate
+1 nucleosome stabilization (27). It will now be
important to investigate which components confer
the switchlike on-off property that has been
proposed for the nucleation event (29).
SYMBIOSIS


Basidiomycete yeasts in the cortex of
ascomycete macrolichens

For over 140 years, lichens have been regarded as a symbiosis between a single fungus,
usually an ascomycete, and a photosynthesizing partner. Other fungi have long been
known to occur as occasional parasites or endophytes, but the one lichen-one fungus
paradigm has seldom been questioned. Here we show that many common lichens are
composed of the known ascomycete, the photosynthesizing partner, and, unexpectedly,
specific basidiomycete yeasts. These yeasts are embedded in the cortex, and their
abundance correlates with previously unexplained variations in phenotype. Basidiomycete
lineages maintain close associations with specific lichen species over large geographical
distances and have been found on six continents. The structurally important lichen
cortex, long treated as a zone of differentiated ascomycete cells, appears to consistently
contain two unrelated fungi.


Most definitions of the lichen symbiosis emphasize
its dual nature: the mutualism
of a single fungus and single photosynthesizing
symbiont, occasionally supplemented
by a second photosynthesizing
symbiont in modified structures (1-4). Together,
these organisms form stratified, often leafy or
shrubby body plans (thalli) that resemble none
of the symbionts in isolation, a feature thought
to be unique among symbioses (7). Attempts to
synthesize lichen thalli from the accepted two
components in axenic conditions, however, have
seldom produced structures that resemble natural


1Institute of Plant Sciences, NAWI Graz, University of Graz,
8010 Graz, Austria. 2Division of Biological Sciences,
University of Montana, Missoula, MT 59812, USA.
3Department of Organismal Biology, Uppsala University,
Norbyvägen 18D, 752 36 Uppsala, Sweden. 4Department of
Ecology, Swedish University of Agricultural Sciences, Post
Office Box 7044, SE-75007 Uppsala, Sweden. 5Institute of
Molecular Biosciences, BioTechMed-Graz, University of
Graz, 8010 Graz, Austria. 6Department of Botany and
Plant Pathology, Purdue University, West Lafayette, IN
47907, USA. 7Program in Integrated Microbial Biodiversity,
Canadian Institute for Advanced Research, Toronto,
Ontario, Canada.
*Corresponding author. Email: toby.spribille@mso.umt.edu
†Present address: Institute of Biodiversity, Animal Health and
Comparative Medicine, College of Medical, Veterinary and Life
Sciences, University of Glasgow, Glasgow G12 8QQ, UK. ‡Present
address: Plant Health and Environmental Laboratory, Ministry for
Primary Industries, Auckland, New Zealand.


thalli (5, 6). Notably, a critical structural feature of 
stratified lichens, the cortex, typically remains 
rudimentary in laboratory-generated symbioses 
(5). Recently, it has been suggested that micro- 
bial players, especially bacteria, may play a role 
in forming complete, functioning lichen thalli (7). 
However, although culturing and amplicon sequencing
have revealed rich communities of
microbes (7, 8), including other fungi (8-10), no
new stably associated symbiotic partners have
been found.

The recalcitrance of lichens to form thalli
in vitro means that characterizing symbiont gene
activity (e.g., through transcriptomics) requires
an approach that works with natural thalli. We
used metatranscriptomics to better understand
the factors involved in forming two macrolichen
symbioses, Bryoria fremontii and B. tortuosa.
These two species have been distinguished for
90 years by the thallus-wide production of the
toxic substance vulpinic acid in B. tortuosa, causing
it to appear yellowish, in contrast to B. fremontii,
which is dark brown (12). Recent phylogenetic
analyses have failed to detect any fixed sequence
differences between the two species in either the
mycobiont (Ascomycota, Lecanoromycetes, Bryoria)
or the photobiont (Viridiplantae, Trebouxia
simplex) when considering four and two loci,
respectively (12, 13). We hypothesized that differential
gene expression might account for the
increased production of vulpinic acid in B.
tortuosa.

We first selected 15 thalli (six from B. fremontii
and nine from B. tortuosa, all free from visible
parasitic infection) from sites across western
Montana, USA, for mRNA transcriptome sequencing.
An initial transcriptome-wide analysis
of single-nucleotide polymorphisms (SNPs) for
Ascomycota and Viridiplantae transcript subsets
showed no correlation between genotype and
phenotype in B. fremontii and B. tortuosa, confirming
previous results (12, 13) (Fig. 1, A and B).
Next, we estimated transcript abundances by mapping
raw reads back to a single, pooled metatranscriptome
assembly and binning by taxon.
Restricting our analyses to Ascomycota and Viridiplantae
revealed little differential transcript abundance
associated with phenotype (Fig. 1, C and E).
Taken together, these analyses confirm previous
conclusions that the two lichen species are nomenclatural
synonyms (2) but still provide no explanation
for the underlying phenotypes (which we
shall continue to refer to by their species names for
convenience). However, by expanding the taxonomic
range to consider all Fungi, we found
506 contigs with significantly higher abundances
in vulpinic acid-rich B. tortuosa thalli. A majority
of these contigs were annotated as Basidiomycota
(Fig. 1D). These data suggested that a previously
unrecognized basidiomycete was present in thalli
of both species but was more abundant whenever
vulpinic acid was present in large amounts.
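
The two quantities plotted in Fig. 1, C to E (logFC against logCPM), can be illustrated with a minimal sketch. This is not the pipeline used in the study: the base-2 logarithm, the pseudocount, the use of group means, and all contig names and read counts below are assumptions made only for illustration.

```python
import math


def cpm(counts):
    """Scale one sample's raw read counts to counts per million reads."""
    total = sum(counts.values())
    return {contig: 1e6 * n / total for contig, n in counts.items()}


def log_fc_and_log_cpm(group_a, group_b, pseudocount=0.5):
    """Return {contig: (logFC, logCPM)} for group_b relative to group_a.

    group_a and group_b are lists of per-sample {contig: read count} dicts.
    Only contigs with nonzero counts in every sample are kept, mirroring the
    "found in all 15 samples" filter described in the Fig. 1 caption.
    The pseudocount (an assumption) avoids taking the log of zero.
    """
    samples = group_a + group_b
    shared = set.intersection(
        *({c for c, n in s.items() if n > 0} for s in samples)
    )
    cpm_a = [cpm(s) for s in group_a]
    cpm_b = [cpm(s) for s in group_b]
    result = {}
    for contig in sorted(shared):
        mean_a = sum(s[contig] for s in cpm_a) / len(cpm_a)
        mean_b = sum(s[contig] for s in cpm_b) / len(cpm_b)
        log_fc = math.log2((mean_b + pseudocount) / (mean_a + pseudocount))
        log_cpm = math.log2((mean_a + mean_b) / 2 + pseudocount)
        result[contig] = (log_fc, log_cpm)
    return result


# Hypothetical toy counts: one low-vulpinic-acid and one vulpinic acid-rich thallus.
low_acid = [{"asco_1": 500, "alga_1": 490, "basidio_1": 10}]
high_acid = [{"asco_1": 500, "alga_1": 420, "basidio_1": 80}]

for contig, (log_fc, log_cpm) in log_fc_and_log_cpm(low_acid, high_acid).items():
    print(f"{contig}: logFC={log_fc:.2f}, logCPM={log_cpm:.2f}")
```

In this toy input the ascomycete contig has the same relative abundance in both thalli, whereas the hypothetical basidiomycete contig is roughly eightfold more abundant in the vulpinic acid-rich thallus. A real analysis would additionally model replicate variance and test significance, which this sketch omits.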

We next sought to determine whether this
uncharacterized basidiomycete was specific to
the studied Bryoria species or could be found
in other lichens. From metatranscriptome contigs
containing ribosomal RNA (rRNA) basidiomycete
sequences, we designed specific primers
for ribosomal DNA [rDNA; 18S, internal transcribed
spacer (ITS), and D1D2 domains of 28S]
to screen lichens growing physically adjacent
to Bryoria in Montana forests. Each assayed
lichen species carried a genetically distinct strain
of the basidiomycete, indicating a high degree of
specificity. Furthermore, we found that Letharia
vulpina, a common lichen species growing intermixed
with Bryoria, maintained basidiomycete
genotypes that were distinct from those in
Bryoria, not only in Montana but also in northern
Europe (fig. S1). When assaying for the basidiomycete
across the seven main radiations of
macrolichens in the class Lecanoromycetes, we
found related basidiomycete lineages associated
with 52 lichen genera from six continents,
including in 42 of 56 sampled genera of the family
Parmeliaceae (fig. S2). As a whole, these data indicate
that basidiomycete fungi are ubiquitous
and global associates of the world’s most speciose
radiation (14) of macrolichens.

To place the basidiomycete lineages in a 
phylogenetic context, we generated a 349-locus 
phylogenomic tree by using gene sequences in- 
ferred from our transcriptome data set and other 
available genomes (table S1). This analysis placed 
the basidiomycete as sister to Cystobasidium 
minutum (class Cystobasidiomycetes, subphylum 
Pucciniomycotina) with high support (Fig. 2A). The 
only previously known lichen-associated mem- 
bers of Cystobasidiomycetes are two species of 
Cyphobasidium, which is hypothesized to cause
galls on species of Parmeliaceae (15). When incorporated
into a broader sample of published
cystobasidiomycete rDNA sequence data (16-18), 
the majority of our lichen-derived sequences form 
a strongly supported monophyletic clade with 
Cyphobasidium (Fig. 2B). Using current classifi- 
cation criteria (18), the lichen-associated line- 
ages would include numerous new family-level 
lineages, and we recognize this set of subclades 
as the new order Cyphobasidiales (19). Applying 
a relaxed molecular clock to our phylogenomic tree 


Fig. 1. Genome-wide divergence and transcript abundance of fungi and 
algae, based on symbiont subsets extracted from wild Bryoria metatran- 
scriptomes. (A and B) Unrooted maximum likelihood topologies for (A) the 
Ascomycota member (lecanoromycete) and (B) the Viridiplantae member 
(alga) within the lichen pair B. fremontii and B. tortuosa, based on 30,001 and 
25,/88 SNPs, respectively. Numbers refer to metatranscriptome sample IDs 
(table S2). Scale bars indicate the average number of substitutions per site.
(C to E) Logarithm of the fold change (logFC) between vulpinic acid-deficient
(B. fremontii) and vulpinic acid-rich (B. tortuosa) phenotypes in 15 Bryoria
metatranscriptomes, plotted against transcript abundance (logCPM, loga- 
rithm of counts per million reads). Only transcripts found in all 15 samples 
were included. Ascomycota transcripts only are shown in (C). All fungal tran- 
scripts are shown in (D), with taxonomic assignments superimposed; a plot 
with statistically significant transcript differential abundance is shown as an 
inset. Viridiplantae transcripts are shown in (E). Red dots indicate a log fold 
change with P < 0.05 in (C), the inset of (D), and (E) (highlighted with arrows). 
Fig. 2. Placement of Cyphobasidiales members and their diversity within
lichens. (A) Maximum likelihood phylogenomic tree based on 39 fungal
proteomes and 349 single-copy orthologous loci. Dating based on a 58-locus 
subsample shows relative splits between Cyphobasidiales and Cystobasidium 
minutum and splits leading to the lecanoromycete genera Xanthoria, Cladonia, 
and Bryoria (colored bars indicate 95% confidence intervals; fungi occurring in 


(Fig. 2A) shows the Cystobasidium-Cyphobasidium 
split occurring around the same time as the origin 
of three of the main groups of lecanoromycete 
macrolichens in which Cyphobasidiales species 
occur, suggesting a long, shared evolutionary 
history. Two fossil calibrations place this split at 
around 200 million years before the present (figs. 
S4 and S5). 

Our initial microscopic imaging failed to reveal
any cells that we could assign to Basidiomycetes
with certainty. Furthermore, attempts to culture
the basidiomycete from fresh thalli were unsuccessful.
We therefore developed protocols for
fluorescent in situ hybridization (FISH) targeting
specific ascomycete and cystobasidiomycete
rRNA sequences. Cystobasidiomycete-specific
FISH probes unambiguously hybridized round,
~3- to 4-μm-diameter cells embedded in the peripheral
cortex of both B. fremontii and B. tortuosa
(Fig. 3 and movie S1). Consistent with the transcript
abundance data, these cells were more
abundant in thalli of B. tortuosa (Fig. 3), where
they were embedded in secondary metabolite residues
(movie S1). Imaging of other lichen species
likewise revealed cells of similar morphology in
the peripheral cortex (fig. S6). Composite three-dimensional
FISH images from B. capillaris show
the cells occurring in a zone exterior to the lecanoromycete
(Fig. 4 and movie S2) and embedded
in polysaccharides (Fig. 4C), explaining
why these cells are not observed in scanning
electron microscopy (Fig. 4A). In some species,
such as L. vulpina, the abundance of hybridized
living cells was low, but selective removal of the
polysaccharide layer through washing revealed
high densities of collapsed, apparently dead cells
within the cortex (fig. S7). These dead cells may
explain the paucity of the FISH signal in some
experiments. The mononucleate single cells (fig.
S8C), evidence of budding, and absence of hyphae
or clamp connections are consistent with an anamorphic
or yeast state in Cystobasidiomycetes.
FISH imaging of Cyphobasidium galls on the
lichen Hypogymnia physodes, obtained from
Norway, confirmed the link to the sexual or teleomorphic
state (fig. S8), which appears to develop
rarely (15). These data suggest that the gall-inducing
form of Cyphobasidium completes its
life cycle entirely within lichens.

It is remarkable that Cyphobasidium yeasts 
have evaded detection in lichens until now, 
despite decades of molecular and microscopic 
studies of the Parmeliaceae (20-22). It seems 
likely that the failure to detect Cyphobasidium 
in both Sanger and amplicon sequencing studies 
(8) is due to multitemplate polymerase chain 
reaction bias. The most widespread clade of 
Cyphobasidium possesses a 595-base pair group 
I intron inserted downstream of the primer binding
site ITS1F, doubling the template length of
ITS, a popular fungal barcode (23). This, combined
with low background abundance, can push
a template below detection thresholds (24). Also,
we cannot rule out that Cyphobasidium yeasts
have actually been sequenced and discarded as
presumed contaminants.

lichens are shown in green). (B) Maximum likelihood rDNA phylogeny of the
Cystobasidiomycetes, with images of representative lichen species
from which sequences were obtained in each clade; thick branches indicate
bootstrap support >70%. Shaded triangles are scaled to the earliest branch
[...]ence divergence in each clade. Full versions of the
trees are shown in fig. S3.

The lichen cortex layer has long been considered
to be key for structural stabilization of
macrolichens, as well as for water and nutrient
transfer into the thallus interior (6, 25). Most
macrolichens possess a basic two-layer cortex
scheme consisting of conglutinated internal hyphae
and a thin, polysaccharide-rich peripheral layer
(25). However, the internal cellular structure is
not uniform across lichens (26), and the composition
of extracellular polysaccharides is poorly
known (27). In Bryoria, the layer in which
Cyphobasidium yeasts occur has not been recognized
as distinct from the cortex (11), although in
other parmelioid lichens, a seemingly homologous
layer has sometimes been referred to as
the “epicortex” (20). The discovery of ubiquitous
yeasts embedded in the cortex raises the prospect
that more than one fungus may be involved
in its construction, and it could explain why lichens
synthesized in vitro from axenically grown ascomycete
and algal cultures develop only rudimentary
cortex layers (5).

In many lichens, the peripheral cortex layer in
which Cyphobasidium yeasts are embedded is
enriched with specific secondary metabolites
(25), the production of which often does not correlate
with the lecanoromycete phylogeny (28).
Fig. 3. Differential abundance of Cyphobasidiales yeasts in B. fremontii
and B. tortuosa. (A) B. fremontii, with (B) few FISH-hybridized live yeast
cells at the level of the cortex. (C) B. tortuosa, with (D) abundant FISH-hybridized
cortical yeast cells (scale bars, 20 μm).


The assumption that these substances are exclu- 
sively synthesized by the lecanoromycete must 
now be considered untested. In B. tortuosa, dif- 
ferential transcript and cell abundance data, along 
with physical adjacency to crystalline residues, 
implicate Cyphobasidium in the production of 
vulpinic acid, either directly or by inducing its 
synthesis by the lecanoromycete. Confirming a 
link by using transcriptome or genome data is 
impossible until the enzymatic synthesis pathway 
of vulpinic acid is described. However, related 
pulvinic acid derivatives are synthesized by other 
members of Basidiomycota (29). 

The assumption that stratified lichens are con- 
structed by a single fungus with differentiated 
cell types is so central to the definition of the 
lichen symbiosis that it has been codified into 
lichen nomenclature (30). This definition has 
brought order to the field but may also have 
constrained it by forcing untested assumptions 
about the true nature of the symbiosis. We sug- 
gest that the discovery of Cyphobasidium yeasts 
should change expectations about the potential 
diversity and ubiquity of organisms involved in 
one of the oldest known and most recognizable 
symbioses in science. 
INFECTION


Increased plasmid copy number is essential for
Yersinia T3SS function and virulence


He Wang, Kemal Avican, Anna Fahlgren, Saskia F. Erttmann, Aaron M. Nuss,
Petra Dersch, Maria Fallman, Tomas Edgren, Hans Wolf-Watz


Pathogenic bacteria have evolved numerous virulence mechanisms that are essential for 
establishing infections. The enterobacterium Yersinia uses a type III secretion system 
(T3SS) encoded by a 70-kilobase, low-copy, IncFII-class virulence plasmid. We report a
novel virulence strategy in Y. pseudotuberculosis in which this pathogen up-regulates the 
plasmid copy number during infection. We found that an increased dose of plasmid- 
encoded genes is indispensable for virulence and substantially elevates the expression and 
function of the T3SS. Remarkably, we observed direct, tight coupling between plasmid 
replication and T3SS function. This regulatory pathway provides a framework for further 
exploration of the environmental sensing mechanisms of pathogenic bacteria. 


Three human pathogenic Yersinia strains (Y.
pestis, Y. enterocolitica, and Y. pseudotuberculosis)
share a common 70-kb virulence plasmid
(IncFII) that encodes a set of virulence
proteins called Yops (Yersinia outer proteins)
(1, 2). These bacterial toxins are secreted by
a plasmid-encoded organelle, the ysc/yop type
III secretion system (T3SS) (3-5). The T3SS comprises
~20 proteins that span the inner and outer
bacterial membranes (6, 7). Upon contact with
eukaryotic cells, Yersinia deploys the T3SS to
translocate Yops into the cytoplasm of target
cells via a translocon formed in the cell membrane
(8, 9). This process is strictly regulated,
and Yop expression and secretion increase after
the bacterium establishes intimate contact with
the eukaryotic target cell (10). This cell contact-dependent
regulation can be mimicked in vitro
at 37°C by depleting Ca²⁺ from the growth medium
(11).

Incubation of Yersinia at 37°C in Ca²⁺-deficient
medium leads to T3SS induction and growth restriction
after approximately two generations (4).
Growth arrest may be due to the metabolic burden
caused by excess expression of plasmid-encoded
T3SS proteins (12). Thus, the function of
the T3SS is paradoxical, because conditions that
promote Yop secretion result in bacterial growth
restriction. This feature is incompatible with infection;
consequently, we reasoned that Yersinia
must have evolved a mechanism to circumvent
this problem. We observed that increased amounts
of virulence plasmid DNA were recovered from
wild-type Y. pseudotuberculosis cells grown under
T3SS-inductive conditions relative to bacteria grown
under T3SS-repressive conditions (fig. S1B). Therefore,
we hypothesized that Yersinia may undergo
rapid changes in gene expression by increasing
and decreasing virulence plasmid copy numbers.
Rapid changes in gene dose could adjust the
T3SS output to trade off virulence costs and the
pathogen’s metabolic capacity to optimize growth.

To explore a potential connection between
virulence and plasmid copy number, we first
determined plasmid (pIBX) copy numbers in
Y. pseudotuberculosis YpIII (YpIII/pIBX) (13)
cultures under different conditions with a polymerase
chain reaction (PCR)-free whole-genome
sequencing approach (14). The depth of coverage
(the number of times a nucleotide is read during
the sequencing process) reflects the concentrations
of chromosomal DNA and plasmid DNA
molecules in the sample. We found that the pIBX
copy number was ~1 per chromosomal equivalent
under T3SS-repressive conditions (26°C) and ~3
under T3SS-inductive conditions (37°C, Ca²⁺-free)
(Fig. 1A). At 37°C in the presence
of Ca²⁺ (T3SS-repressive conditions), the copy number
increased only modestly (1.6 per chromosomal
equivalent). Similar differences in plasmid




Fig. 1. Y. pseudotuberculosis differentially regulates virulence plasmid copy number in vitro.
(A) Virulence plasmid copy numbers in DNA isolated from Y. pseudotuberculosis YpIII/pIBX grown
for 3 hours under different conditions [white, 26°C; gray, 37°C under T3SS-repressive conditions
(Ca²⁺); black, 37°C under T3SS-inductive conditions] as determined by whole-genome sequencing
(TruSeq). Copy number was calculated as the ratio of average depth of plasmid DNA coverage to
chromosomal DNA coverage (N = 2). (B) Time course of plasmid copy number increase determined by
qPCR. At time zero, Y. pseudotuberculosis cultures were shifted from 26° to 37°C under T3SS-repressive
(gray) and T3SS-inductive (black) conditions. Plasmid copy number is defined as the number of plasmid
equivalents per chromosome (N = 6). (C) Plasmid copy number changes in YpIII/pIB73 (ΔlcrF) and
YpIII/pIB621 (ΔyopD) determined by qPCR after 3 hours of growth under the same conditions as in (A)
(N = 6). (D) qPCR results showing plasmid copy numbers in YpIII/pIBX (wt) and a YpIII/pIB621 (ΔyopD)
overexpressing the antisense copA RNA fused to the yopE promoter in cis (pyopE-copA) after 3 hours of
growth at 37°C under T3SS-repressive (gray) and T3SS-inductive (black) conditions. Copy numbers in
YpIII/pIBX (wt) are shown as control (N = 4). Inset: Immunoblot of whole-cell lysates from the indicated
strains probed with antibodies to Yops. Data are means ± SEM (*P < 0.05, **P < 0.01, ***P < 0.001,
Mann-Whitney U test; ns, not significant).
copy numbers were found with quantitative real-time
PCR (qPCR) (fig. S1A) and Southern blot
analysis (fig. S1B).
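
The sequencing-based estimate described above (copy number as the ratio of average plasmid coverage depth to average chromosomal coverage depth, as stated in the Fig. 1 legend) can be written as a short sketch. The function names and the depth values are hypothetical; per-position depths would in practice come from a read-alignment tool, which is not shown here.

```python
def mean_depth(depths):
    """Average per-nucleotide coverage depth over a replicon."""
    if not depths:
        raise ValueError("no depth values supplied")
    return sum(depths) / len(depths)


def plasmid_copy_number(plasmid_depths, chromosome_depths):
    """Plasmid copies per chromosomal equivalent.

    Computed as mean plasmid depth divided by mean chromosomal depth,
    following the definition given in the Fig. 1 legend.
    """
    chromosome_mean = mean_depth(chromosome_depths)
    if chromosome_mean == 0:
        raise ValueError("chromosomal depth is zero; ratio is undefined")
    return mean_depth(plasmid_depths) / chromosome_mean


# Hypothetical per-position depths for a few sampled nucleotides.
plasmid = [90, 100, 110]
chromosome = [30, 35, 35]
print(f"copy number: {plasmid_copy_number(plasmid, chromosome):.2f}")
```

Because the library preparation is PCR-free, depth is assumed here to scale with the number of template molecules; biases such as GC content or replication-associated dosage gradients along the chromosome are ignored in this sketch.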

We used qPCR to determine the time course 
of pIBX copy number variations. We found that 
the copy number increased and plateaued 1 hour 
after the temperature shift to T3SS-inductive 
conditions (Fig. 1B). Bacteria grown at 37°C under 
T3SS-repressive conditions plateaued at ~2 chro- 
mosome equivalents (Fig. 1B). These results con- 
firmed that the plasmid copy number increased 
in Y. pseudotuberculosis under T3SS-inductive 
conditions. 
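
For readers unfamiliar with qPCR-based copy number estimates, one common calculation is sketched below. It is an assumption, not the method reported in the study: the text only states that copy number is expressed as plasmid equivalents per chromosome. The sketch assumes one single-copy target on the plasmid and one on the chromosome, equal amplification efficiency for both (2.0 means perfect doubling per cycle), and hypothetical threshold-cycle (Ct) values.

```python
from statistics import mean


def copy_number_from_ct(ct_plasmid, ct_chromosome, efficiency=2.0):
    """Estimate plasmid equivalents per chromosome from replicate Ct values.

    A plasmid target that crosses the detection threshold one cycle earlier
    than the chromosomal target is taken to be `efficiency`-fold more
    abundant (assumed equal efficiency for both primer pairs).
    """
    delta_ct = mean(ct_chromosome) - mean(ct_plasmid)
    return efficiency ** delta_ct


# Hypothetical technical replicates for one culture.
plasmid_ct = [18.0, 18.2, 17.8]
chromosome_ct = [19.0, 19.1, 18.9]
print(f"copy number: {copy_number_from_ct(plasmid_ct, chromosome_ct):.2f}")
```

If the two primer pairs amplify with different efficiencies, a standard curve for each target would be needed instead of a single `efficiency` value.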

To analyze the control of plasmid copy number
in Yersinia, we investigated copy number changes
in T3SS regulatory mutants. Earlier studies have
shown that both Ca²⁺ and temperature regulate
the T3SS in Yersinia (15, 16). The T3SS transcriptional
activator LcrF activates transcription of
the T3SS regulon in response to elevated temperature
(37°C) and low Ca²⁺; lcrF mutants
lack Yop expression, and their growth is not restricted
in Ca²⁺-depleted medium at 37°C (16). In contrast,
yopD mutants display Ca²⁺-insensitive Yop
expression and restricted growth at 37°C, irrespective
of Ca²⁺ concentration (17). Thus, YopD
is involved in a negative-feedback regulatory pathway
controlling T3SS expression (18). We found
that plasmid copy number control was not affected
in a ΔlcrF mutant (YpIII/pIB73) (Fig. 1C).
In contrast, in a ΔyopD mutant (YpIII/pIB621),
the plasmid copy number was elevated at 37°C
under T3SS-repressive conditions (gray bars, Fig.
1C). Thus, YopD directly or indirectly inhibits a
default temperature-regulated plasmid copy number
increase in Yersinia.

The plasmid replication initiation protein RepA 
initiates IncFII plasmid replication. The level of 



Fig. 2. Integration of the pIBX (IncFII⁻) amplicon into the chromosome. (A) Homologous recombination
of pNQΔoriR1-sacB (Cmʳ) into pIBX (Kmʳ) results in a replicating pIBX:pNQΔoriR1-sacB
intermediate (Cmʳ, Kmʳ, sacB⁺). SacB counterselection in the presence of Km and Cm selects for
sacB clones where the pIBX:pNQ plasmid (Cmʳ, Kmʳ) has been integrated into the chromosome.
The final Cmʳ, Kmʳ, sacB⁻ strain results from the deletion of 3426 bp encoding the native pIBX IncFII
replicon and sacB (gray). Dashed lines indicate homologous recombination events. The integrated
construct, flanked by duplicated YPK_3687 sequences (blue arrows), can be amplified by selecting
for gene dose-dependent increase in Cmʳ conferred by the cat gene (orange box). Black arrows and
corresponding numbers denote the experimentally derived frequencies of indicated genetic events.
(B) Number of reads of pIBX and chromosome sequences determined by whole-genome DNA sequencing.
Red, pIBX sequences; blue, chromosomal sequences. Gap in the plasmid sequence alignments
shows the 3426-bp deletion of the IncFII replicon. Arrowhead indicates increased YPK_3687
RepA is controlled at the translational level by the
antisense RNA CopA (19). To investigate whether 
the IncFII replicon per se is responsible for in- 
creased virulence plasmid copy number, we ex- 
pressed Yersinia copA (20) under control of the 
yopE promoter. This should increase CopA levels 
and consequently reduce the rate of initiation of 
virulence plasmid replication under T3SS-inductive 
conditions. This construct was integrated in cis 
into the virulence plasmids of wild-type Yersinia 
and a AyopD mutant. CopA overexpression resulted 
in reduced plasmid copy numbers under T3SS- 
inductive conditions in both strains (Fig. 1D). 
This shows that the IncFII replicon, per se, is 
essential for the T3SS-regulated copy number 
increase. Furthermore, the reduced plasmid copy 
number in the CopA-overexpressing strains sup- 
pressed T3SS-related growth defects and reduced 
Yop expression under T3SS-inductive conditions 
(Fig. 1D, inset, and fig. S2). This indicates that 
Yersinia has evolved a T3SS-dependent mecha- 
nism to down-regulate the copy number of its 
virulence plasmid, and that this mechanism is 
likely to counteract the metabolic burden asso- 
ciated with induction of the T3SS. 

To investigate whether increased plasmid copy
number is involved in virulence, we designed a
mutant strain that could not change virulence
plasmid copy number during infection. This was
achieved by inserting pIBX, without the replication
function (IncFII⁻), into the chromosome of
Y. pseudotuberculosis (Fig. 2A). The final mutant,
YpIII:(cIBX)n=1, contained one copy of the
replication-deficient pIBX inserted into the chromosomal
gene YPK_3687 (Fig. 2A). YPK_3687 was
previously shown to be redundant for Y. pseudotuberculosis
virulence (21). Insertion of a single
copy of the replication-deficient pIBX plasmid,
(cIBX)n=1, into YPK_3687 was verified by DNA
sequencing and qPCR (Fig. 2B and fig. S3). As
expected, the mutant strain was unable to change
the gene dose of plasmid-encoded genes in response
to T3SS activity (fig. S3). Secretion analysis
showed that YpIII:(cIBX)n=1 retained temperature-
and Ca²⁺-dependent regulation of Yop expression
and secretion, but Yop levels were severely
reduced by comparison with the wild type (Fig.
2C). Yop translocation experiments showed that
YpIII:(cIBX)n=1 could translocate the reporter
protein YopH1-99-Bla (22) into HeLa cells in a
dose-dependent manner, but less than the parental
YpIII/pIBX wild-type strain (Fig. 3A).
The YpIII:(cIBX)n=1 mutant was severely attenuated
in an oral BALB/c mouse infection model.
YpIII:(cIBX)n=1 initially colonized Peyer’s patches
(PP), cecum, and mesenteric lymph nodes (MLNs;
Fig. 3B and fig. S4), but all infected mice survived
and recovered their initial body weight (Fig. 3B,
inset), showing that the YpIII:(cIBX)n=1 strain
is avirulent.

To test whether the reduced virulence of
YpIII:(cIBX)n=1 was a result of a reduced gene
dose of virulence plasmid-encoded genes, we 
amplified the integrated virulence plasmid in 
the mutant. This was achieved by selecting for 
a gene dose-dependent increase in chloram- 
phenicol resistance conferred by the cat gene 

Fig. 3. Increased gene dose of plasmid-encoded genes is essential for Yersinia virulence.
(A) Translocation of YopH1–99-Bla into HeLa cells after 90 min. Histograms represent mean ± SEM
of a representative experiment. Translocation-deficient ΔyopBD mutant (YpIII/pIB219) expressing
YopH1–99-Bla is included as a negative control. Response ratio: blue/green ratio divided by back-
ground ratio. Lower panel shows representative micrographs of cells infected at a multiplicity of
infection (MOI) of 50. (B) In vivo images showing anesthetized mice infected with YpIII:(cIBX)n=1
(left) or YpIII:(cIBX)n=3 (right). Colors indicate levels of light emitted by bioluminescent bacteria.
Inset: Changes in average body weight of infected mice over time [open symbols, YpIII:(cIBX)n=1;
solid symbols, YpIII:(cIBX)n=3]. (C) Virulence plasmid gene dose measured in bacterial colonies
isolated from Peyer’s patches (PP), mesenteric lymph node (MLN), and cecum from infected mice
2 days after infection. Each point represents the virulence plasmid copy number per chromosome
of a single colony determined by qPCR [open symbols, YpIII:(cIBX)n=1; solid symbols, YpIII:(cIBX)n=3].
Data are means ± SEM (***P < 0.001, Mann-Whitney U test).

encoded by the integrated construct (Fig. 2A) 
(23, 24). Amplification of the cat gene was pos- 
sible through homologous recombination of the 
duplicated [831 base pairs (bp)] YPK_3687 se- 
quences flanking the integrated IncFII pIBX 
plasmid (Fig. 2A). When YpIII:(cIBX)n=1 was
plated onto agar plates with different chloram-
phenicol concentrations, we recovered clones
resistant to chloramphenicol (150 μg/ml) at a fre-
quency of 1 x 10°° and found duplication of the 
integrated virulence plasmid. After initial duplica- 
tion, further amplification was achieved after se- 
lection on plates with chloramphenicol (300 μg/ml)
at a frequency of 1 x 10°“. As expected, in the 
absence of selective pressure (23), the amplified 
virulence plasmid-containing clones were unstable 
and reverted to the original single-copy genotype 
at a frequency of 2 x 10’. These revertants showed 
the same phenotype as the original YpIII:(cIBX)n=1
variant. 

A clone with three copies of the plasmid 
[YpIII:(cIBX)n=3] was selected for further analysis
of T3SS function and virulence. We verified the
low- and high-copy genotypes of YpIII:(cIBX)n=1
and YpIII:(cIBX)n=3, respectively, by whole-genome
sequencing and qPCR (Fig. 2B and fig. S3).
YpIII:(cIBX)n=3 showed Yop expression, secretion,
and translocation profiles similar to those of the
wild-type strain (Fig. 2C and Fig. 3A). Whereas
the YpIII:(cIBX)n=1 strain showed a growth rate
similar to that of the plasmid-cured YpIII (T3SS⁻)
strain, the YpIII:(cIBX)n=3 strain showed reduced
growth at 37°C under T3SS-inductive conditions
similar to that of the wild-type strain (fig. S5).
YpIII:(cIBX)n=3 was virulent in an oral BALB/c
mouse infection model (Fig. 3B).

All infected mice developed a systemic infec- 
tion and were consequently killed. Bacterial loads 
2 days after infection were higher in homog- 
enized PPs from animals infected with virulent 
YpIII:(cIBX)n=3 than in those infected with avi-
rulent YpIII:(cIBX)n=1 by two orders of magni-
tude (fig. S4). The high- and low-copy genotypes
of bacterial strains were retained after passage
through the mouse. YpIII:(cIBX)n=3 colonies
recovered from PPs, MLNs, and cecum 2 days 
after infection showed variable numbers of vir- 
ulence plasmid equivalents (Fig. 3C). Remarkably, 
several isolated colonies possessed greater plas- 
mid equivalents than the original three, indi- 
cating that the integrated plasmid was under 
selective pressure for further amplification. Col-
lectively, the observed gene dose-dependent vir-
ulence of YpIII:(cIBX)n=1 and YpIII:(cIBX)n=3
indicates that plasmid copy number amplifi-
cation is essential for Yersinia virulence.

To determine whether the plasmid copy num- 
ber increases during infection, we isolated total 
DNA from PPs of mice after infection with wild- 
type YpIII/pIBX. The copy number of pIBX was 
determined by whole-genome DNA sequencing 
and qPCR in the complex DNA sample. Both in- 
dependent methods showed a factor of 6 in- 
crease in virulence plasmid copy number in the 

Fig. 4. Yersinia virulence plasmid copy number
increases during infection in mice. (A) Virulence
plasmid copy number of wild-type YpIII/pIBX in
total DNA extracted from infected Peyer’s patches
(PP) 48 hours after infection, determined by whole-
genome sequencing (TruSeq) (table S1) and by
qPCR. Plasmid copy number of infecting inoculum
determined by qPCR is shown as control (N = 4).
(B) Plasmid copy numbers of wild-type YpIII/pIBX
in lysates from homogenates from indicated organs
determined by qPCR. Plasmid copy number of in-
fecting inoculum is shown as control (N = 3). (C) The
ratio of copA:repA RNA decreases in Yersinia during
infection. The copA:repA ratio was calculated as
reads per kilobase of transcript per million mapped
reads (RPKM) of copA and repA (table S2) from
bacteria grown at 25°C (inoculum), 37°C, and in-
fected PP (in vivo) (N = 3). Data are means ± SEM;
*P < 0.05, **P < 0.01, ***P < 0.001 (difference be-
tween control and the respective conditions; un-
paired t test).

PPs relative to the infecting inoculum (Fig. 4A). 
Increased virulence plasmid copy number dur- 
ing infection was verified by qPCR of lysates 
prepared from animal tissues recovered at dif- 
ferent times after oral infections with wild-type 
YpIII/pIBX (Fig. 4B). We observed substantial
increases in virulence plasmid copy numbers in 
all infected samples analyzed. 
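
The copy-number comparison described above can be illustrated with a small calculation. The following is a minimal sketch, assuming that copies per chromosome are estimated as the ratio of length-normalized read depth on the plasmid to that on the chromosome; the function names, replicon lengths, and read counts are hypothetical and are not taken from the study.

```python
def copies_per_chromosome(plasmid_reads, plasmid_len, chrom_reads, chrom_len):
    """Estimate plasmid copies per chromosome from mapped-read counts.

    Read counts are divided by replicon length so that a longer replicon
    does not look more abundant simply because it collects more reads.
    """
    if plasmid_len <= 0 or chrom_len <= 0 or chrom_reads <= 0:
        raise ValueError("lengths and chromosomal read count must be positive")
    plasmid_depth = plasmid_reads / plasmid_len
    chrom_depth = chrom_reads / chrom_len
    return plasmid_depth / chrom_depth


def fold_change(sample_copies, inoculum_copies):
    """Copy number in a sample relative to the infecting inoculum."""
    return sample_copies / inoculum_copies


# Hypothetical values, chosen only to make the arithmetic easy to follow.
PLASMID_LEN = 70_000      # assumed plasmid size (bp)
CHROM_LEN = 4_700_000     # assumed chromosome size (bp)

inoculum = copies_per_chromosome(14_000, PLASMID_LEN, 940_000, CHROM_LEN)
peyers_patch = copies_per_chromosome(84_000, PLASMID_LEN, 940_000, CHROM_LEN)

print(f"inoculum: {inoculum:.2f} copies per chromosome")
print(f"Peyer's patch: {peyers_patch:.2f} copies per chromosome")
print(f"fold change: {fold_change(peyers_patch, inoculum):.2f}")
```

In a mixed host-bacterial DNA sample, only reads that map to the bacterial replicons would enter this ratio, which is why the estimate does not depend on how much host DNA is present.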

Reduced amounts of copA antisense RNA 
relative to repA mRNA have been shown to lead 
to elevated RepA levels and thus to an increase 
in plasmid replication initiation (19). Analysis of
copA and repA transcript levels in Yersinia during 
infection showed that the ratio of copA RNA to 
repA RNA decreased significantly during PP col- 
onization relative to bacteria grown in vitro (Fig. 
4C), corroborating the findings presented above. 
Our results show that T3SS-related copy num- 
ber regulation affects the stability or transcription 
of both RNA species (table S2 and fig. S6). Taking 
into account the actual gene dose of plasmid- 
encoded genes under the different conditions, 
we found that the major change was a marked 
decrease in copA levels while the repA mRNA 
levels per plasmid copy remained basically un- 
changed (table S2 and fig. S6). Such a finding 
favors a model in which T3SS-related copy number 
regulation operates via a change in copA antisense 
RNA levels. 
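
To make the normalization in this argument concrete, the sketch below computes RPKM values, the copA:repA ratio, and expression per plasmid copy. It is an illustration under stated assumptions only: the transcript lengths, read counts, and copy numbers are invented, and the helper names are hypothetical.

```python
def rpkm(read_count, transcript_len_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_len_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (transcript_len_bp * total_mapped_reads)


def per_plasmid_copy(expression, plasmid_copies):
    """Scale an expression value by the gene dose (plasmid copies per cell)."""
    return expression / plasmid_copies


# Hypothetical inputs: (copA reads, repA reads, plasmid copies per chromosome).
COPA_LEN, REPA_LEN = 90, 900          # assumed transcript lengths (nt)
LIBRARY_SIZE = 1_000_000              # assumed mapped reads per sample
conditions = {
    "inoculum": (900, 900, 1),
    "in vivo": (900, 5_400, 6),
}

summary = {}
for name, (copa_reads, repa_reads, copies) in conditions.items():
    copa = rpkm(copa_reads, COPA_LEN, LIBRARY_SIZE)
    repa = rpkm(repa_reads, REPA_LEN, LIBRARY_SIZE)
    summary[name] = {
        "ratio": copa / repa,
        "copA_per_copy": per_plasmid_copy(copa, copies),
        "repA_per_copy": per_plasmid_copy(repa, copies),
    }
    print(name, summary[name])
```

With these invented numbers the copA:repA ratio falls in vivo because copA per plasmid copy drops while repA per plasmid copy stays constant, which is the pattern the text describes.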

Our results show that an increased dose of 
plasmid-encoded genes is essential for Yersinia 
virulence. One copy of the virulence plasmid that 
encodes the T3SS regulon is insufficient to establish 
a systemic infection in mice; thus, Yersinia has
evolved a mechanism to rapidly increase the copy 
number during an infection. Our results also show 
that, despite the increased metabolic burden 
caused by three T3SS regulon copies, there was 
selective pressure during infection to increase 
the T3SS gene dose. Moreover, the secreted pro- 
tein, YopD, is involved in regulation of the copy 
number. This feature indicates that the system is 
tightly controlled, linking external sensors to 
T3SS activity and the plasmid copy number. 
In Yersinia, high T3SS activity is deleterious 
for growth in vitro. Yet T3SS expression is essen- 
tial for successful infection and proliferation in 
hosts. We found an inverse correlation between 
virulence plasmid gene dose and growth rate 
under T3SS-inductive conditions. Therefore, we 
propose that Yersinia maintains the entire T3SS 
virulence system in a plasmid, which enables 
rapid adjustments in T3SS expression in response 
to prevailing environmental conditions. This basic 
regulatory tactic is likely to apply to any plasmid- 
encoded genes by analogous mechanisms. Our 
findings provide a framework for further func- 
tional investigations of the regulatory pathways 
that allow pathogens to respond to environmen- 
tal cues, and plasmid copy number regulation may 
be important in bacterial antibiotic resistance. 

EPIGENETICS


Early-life nutrition modulates the 
epigenetic state of specific rDNA 
genetic variants in mice 


Michelle L. Holland,’*+ Robert Lowe,'* Paul W. Caton,” Carolina Gemma," 

A suboptimal early-life environment, due to poor nutrition or stress during pregnancy,
can influence lifelong phenotypes in the progeny. Epigenetic factors are thought to be
key mediators of these effects. We show that protein restriction in mice from conception
until weaning induces a linear correlation between growth restriction and DNA
methylation at ribosomal DNA (rDNA). This epigenetic response remains into adulthood
and is restricted to rDNA copies associated with a specific genetic variant within the
promoter. Related effects are also found in models of maternal high-fat or obesogenic
diets. Our work identifies environmentally induced epigenetic dynamics that are
dependent on underlying genetic variation and establishes rDNA as a genomic target
of nutritional insults.


Exposure to an adverse in utero environment
can have a long-lasting influence on adult
phenotypes in mammals, a process termed
“developmental programming” (1, 2). Con-
sequently, there is great interest in identi-
fying the molecular mechanisms that underlie 
developmental programming, and, in this regard, 
modulation of the epigenome has emerged as a 
potentially key contributing factor (3, 4). 

To explore epigenetic mechanisms involved in 
developmental programming, we employed a 
maternal protein restriction model (5). Inbred 
C57BL/6J mice were mated, and G0 females were
assigned to either a protein-restricted diet (PR)
(8% protein) or a control diet (C) (20% protein)
(table S1) until their G1 offspring were weaned.
Only male G1s were studied in detail (n = 146).
From weaning onward, both G1-PR and G1-C
mice were kept on a control diet until they were
killed at 16 to 20 weeks. Consistent with previous 
work, G1-PR males were ~25% lighter than G1-C 
at weaning (5) (Fig. 1A) (P = 2 x 10°). PRs also 
displayed reduced spontaneous locomotor activ- 
ity (fig. S1) and reduced glucose-stimulated in- 
sulin secretion (fig. S2). 

Several studies have shown that developmen- 
tal programming can perturb DNA methylation 
profiles (1). We used reduced representation bi-
sulfite sequencing (RRBS) to generate genome-
scale, single-base resolution DNA methylomes 
for eight G1-PR and eight G1-C mice, initially 
focusing on sperm, because it can be isolated to a 
high degree of purity. After genome-wide correction, 
we identified a single 1916-base pair (bp) differ- 
entially methylated region (DMR) hypermethylated 
in G1-PR males that mapped to Rn45s on chro-
mosome 17 (mm10) (table S2). Further analysis
revealed that Rn45s displays 98% homology to the 
973- to 2883-bp region of the ribosomal DNA (rDNA) 
consensus (Fig. 1B). rDNA is excluded from genome 
assemblies because of its multicopy nature. We 
therefore remapped the RRBS data to the con-
sensus sequence for mouse rDNA (BK000964)
and confirmed extensive hypermethylation in PR 
sperm across the entire promoter and coding re- 
gions (~13.5 kb) (Fig. 1B). Directly correlating 
weaning weight with rDNA methylation levels 
revealed that G1-PR displayed a significantly greater 
negative correlation between weaning weight and 

Fig. 1. Maternal PR induces a correlation between
rDNA methylation and weaning weight. (A) Weaning
weight of G1-PR males (red; 62 individual mice from 17
different litters) was reduced compared with G1-C (black;
84 individual males from 20 different litters) (t test, P =
2 x 10°© using litter means, and P < 2.2 × 10⁻¹⁶ using
individual mice). Small points represent individual mice;
larger squares represent the mean of a given G1 litter.
(B) RRBS analysis of rDNA in G1 sperm shows that PRs
(n = 8) are hypermethylated compared to controls (n = 8).
The line represents mean methylation, and points rep-
resent individual mice. The rDNA schematic shows the
rRNA subunits, transcriptional start site (TSS), external
transcribed spacer (ETS), and internal transcribed spacer
(ITS). The Rn45s region identified in the initial RRBS
analysis is 98% homologous to the region shaded blue.
(C) The correlation coefficient (τ) between weaning
weight (ww) and DNA methylation across the rDNA.
Highlighted are examples of a positive correlation (green),
close to zero (purple), and negative (orange). CpG-133 is
circled in blue.

Fig. 2. Diet-induced methylation dynamics are restricted to a specific genetic
variant of rDNA. (A) BisPCR-seq amplicons were generated to simultaneously
analyze methylation at CpG-133 (methylation indicated by black circle) and
genetic variation at position -104 (A or C) (left panel). CpG-133 methylation
levels in sperm for each genetic variant are shown for G1-C (black; n = 15) and
G1-PR (red; n = 17). (B) In sperm, methylation levels at A-variant-associated
CpG-133 sites (CpG-133^A) and weaning weight are not correlated in G1-C
(black; n = 15, τ = 0.20, P = 0.30) but are negatively correlated in G1-PR (red; n = 17,
τ = -0.43, P = 0.017). (C) In liver, CpG-133^A methylation levels and weaning weight
are not correlated in G1-C (black; n = 26, τ = -0.14, P = 0.32) but are negatively
correlated in G1-PR (red; n = 24, τ = -0.46, P = 0.0016). (D) In sperm, CpG-133^A
methylation levels are uncorrelated with the percentage of total rDNA copies with
an A-variant (%A) in G1-C (black; n = 15, τ = -0.07, P = 0.77) but are positively
correlated in G1-PR (red; n = 17, τ = 0.71, P = 1.9 x 10°).

Fig. 3. Functional consequences of altered rDNA dynamics. (A) pRNA is transcribed from early
replicating rDNA copies (assumed to be unmethylated at CpG-133). Therefore, the percentage of
pRNA reads that encode an A at position -104 [pRNA(%A), indicated in blue, right] should reflect the
proportion of A-variant rDNA copies that are unmethylated at CpG-133 (%A^UN). (B) pRNA(%A)
positively correlates with %A^UN in both G1-C (black) and G1-PR (red) liver (total, n = 23, τ = 0.61, P =
14 x 10°*). (C) %A^UN is not correlated with the abundance of 45S-rRNA in liver of G1-C (black; n =
14, τ = 0.03, P = 0.91), but is positively correlated in liver of G1-PR (red; n = 12, τ = 0.52, P = 0.021).

DNA methylation compared with G1-C (Wilcoxon
rank sum test; P < 2.2 × 10⁻¹⁶) (Fig. 1C). This cor-
relation was not confounded by weight or age at
death (fig. S3). 

In the C57BL/6J genome, rDNA is composed 
of hundreds of copies in large arrays on chromo- 
somes 12, 15, 18, and 19, but only a subset are 
actively transcribed (6). Silenced copies are meth- 
ylated at a CpG site located 133-bp upstream of 
the 45S-rRNA transcriptional start site (Fig. 1C), 
and this prevents binding of the transcription 
factor UBF (upstream binding factor) and assem- 
bly of RNA polymerase I (7). We therefore focused 
on CpG-133 in the rest of the study using high- 
throughput sequencing (>1000X coverage) of bi- 
sulfite polymerase chain reaction (PCR) amplicons 
(bisPCR-seq). BisPCR-seq analysis of the same 
samples profiled by RRBS revealed strong con-
cordance between the two methods (fig. S4) (τ =
0.77, P = 1 x 10).

As rDNA copies within a single genome are 
genetically polymorphic (8), we designed the bisPCR- 
seq amplicon targeting CpG-133 to simultaneously 
assay previously documented genetic variation at 
position -104 (C or A, Fig. 2A). (Note that this 
variant does not overlap a CpG site) (9). CpG-133 
methylation levels were substantially lower for 
the C-variant relative to the A-variant (Fig. 2A), 
and there was no interaction between C-variant- 
associated CpG-133 methylation and weaning weight 
in G1-PR or G1-C sperm (fig. S5). On the other hand,
CpG-133 methylation levels of A-variant rDNA
(which we denote as CpG-133^A) were negatively
correlated with weaning weight (Fig. 2B) (τ = -0.43,
P = 0.017). Figure 2B incorporates additional males
(nine G1-PR and seven G1-C from litters not rep- 
resented in the RRBS data), reinforcing the neg- 
ative correlation between weaning weight and 
total CpG-133 methylation observed in the RRBS 
data set. BisPCR-seq analysis of in vitro methylated 
samples confirmed that there was no amplification 
bias associated with either variant (fig. S6). We also 
confirmed sperm purity by analysis of several 
parentally imprinted regions (fig. S7). Analysis of 
liver using BisPCR-seq revealed a strong correla- 
tion with sperm within individual G1-C (fig. S8) 
(τ = 0.72, P = 0.00028) or G1-PR animals (fig. S8)
(τ = 0.54, P = 0.0041). Liver CpG-133^A methylation
was negatively correlated with weaning weight
in G1-PR (τ = -0.46, n = 24, P = 0.0016) but not in
G1-C (n = 26) (Fig. 2C). Collectively, these data
demonstrate that PR exposure induces not just
rDNA hypermethylation but also a linear rela-
tionship between a phenotypic outcome (weaning
weight) and CpG-133^A methylation in sperm and
liver, which is maintained into adulthood. 
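
Because each bisPCR-seq read covers both position -104 and CpG-133, methylation can be tallied separately for each genetic variant. The sketch below illustrates that bookkeeping. It assumes every read has already been decoded into a variant call and a methylation call; the decoding of bisulfite-converted bases, the function name, and the read counts are all hypothetical.

```python
from collections import Counter


def methylation_by_variant(reads):
    """Fraction of reads methylated at CpG-133, split by the -104 variant.

    reads: iterable of (variant, methylated) pairs, where variant is 'A' or 'C'
    and methylated is True when CpG-133 was called methylated on that read.
    """
    totals = Counter()
    methylated = Counter()
    for variant, is_methylated in reads:
        totals[variant] += 1
        if is_methylated:
            methylated[variant] += 1
    return {variant: methylated[variant] / totals[variant] for variant in totals}


# Hypothetical amplicon: 60 A-variant reads (30 methylated) and
# 40 C-variant reads (2 methylated).
reads = (
    [("A", True)] * 30 + [("A", False)] * 30 +
    [("C", True)] * 2 + [("C", False)] * 38
)

levels = methylation_by_variant(reads)
print(f"A-variant CpG-133 methylation: {levels['A']:.2f}")
print(f"C-variant CpG-133 methylation: {levels['C']:.2f}")
```

Keeping the two tallies separate is what allows a diet effect confined to one variant to be seen, since pooling the reads would dilute it with the largely unmethylated C-variant copies.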
Further exploration of the bisPCR-seq data re- 
vealed interindividual variation in the relative copy 
number of rDNA harboring the A-variant at posi- 
tion -104, even in an inbred genetic background. 
This underlying copy number variation (which we 
denote as %A, i.e., the percentage of A-variant 
reads relative to total coverage for this amplicon) 
was positively correlated between sperm and liver 
of both G1-C (fig. S9) (τ = 0.77, P = 7 x 10°°) and
G1-PR animals (fig. S9) (τ = 0.73, P = 3.7 x 10°).
The accuracy of the bisPCR-seq-derived estimates
of %A was confirmed by whole-genome rese-
quencing of six mice (fig. S10) (τ = 1, P = 0.0028).
Furthermore, CpG-133^A methylation correlated
positively with %A in G1-PR sperm (Fig. 2D) (τ =
0.71, P = 1.9 x 10”) and liver (fig. S11) (τ = 0.31, P =
0.034) but not in G1-C sperm (Fig. 2D) or liver
(fig. S11). Therefore, early-life PR induces an in- 
terdependence between underlying variation in 
the relative abundance of a specific genetic var- 
iant of rDNA and methylation state of this var- 
iant at a functionally relevant CpG site. 
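
The perfect rank agreement reported above for the six resequenced mice (a coefficient of 1 with P = 0.0028) can be reproduced with a short calculation. This sketch assumes the reported coefficient is Kendall's rank correlation, which this section does not state explicitly, and uses invented %A values. With six untied samples there are 6! = 720 possible orderings, of which only two (fully concordant and fully discordant) reach an absolute coefficient of 1, so the exact two-sided P value is 2/720, roughly 0.0028.

```python
from itertools import combinations, permutations


def kendall_tau(x, y):
    """Kendall's tau-a for two equal-length sequences without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def exact_two_sided_p(x, y):
    """Share of all orderings of y whose |tau| is at least the observed |tau|."""
    observed = abs(kendall_tau(x, y))
    hits = total = 0
    for shuffled in permutations(y):
        total += 1
        if abs(kendall_tau(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical %A estimates for six mice from the two methods.
bispcr_pct_a = [31.0, 35.5, 40.2, 44.8, 52.1, 57.3]
wgs_pct_a = [30.2, 36.1, 41.0, 43.9, 53.0, 58.4]

tau = kendall_tau(bispcr_pct_a, wgs_pct_a)
p_value = exact_two_sided_p(bispcr_pct_a, wgs_pct_a)
print(f"tau = {tau:.2f}, exact P = {p_value:.4f}")
```

Enumerating permutations is only practical for very small samples; for larger n, a library routine such as `scipy.stats.kendalltau` would be the usual choice.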

rDNA copies that lack methylation at CpG-133 
have the potential to be transcriptionally active 
(7). As most methylation is localized to A-variant 
rDNA, both the level of methylation at CpG-133^A
and the relative abundance of this variant (i.e., %A) 
will contribute toward transcriptional competency. 
This interaction can be represented as the percent- 
age of total rDNA copies that are both A-variant 
and unmethylated at CpG-133 (which we denote 
as %A^UN). (Note that %A^UN is different from sim-
ply considering the percentage of CpG-133^A that
is unmethylated.) As expected, %A^UN correlates
between the sperm and liver of G1-C and G1-PR
mice (fig. S12). To confirm the functional impor-
tance of %A^UN, we analyzed a regulatory non-
coding RNA [promoter-associated RNA (pRNA)]
that spans the rDNA promoter (Fig. 3A). pRNA is 
transcribed from early replicating and unmeth- 
ylated rDNA copies (10). It functions in trans to 
recruit nucleolar chromatin remodeling complex 
and DNA methyltransferase to silenced rDNA copies 
(11). Using reverse transcription quantitative PCR
(RT-qPCR), we generated a pRNA-derived amplicon
spanning the genetic polymorphism at position -104
and determined the percentage of A-variant reads
after high-throughput sequencing [pRNA(%A)].
The pRNA(%A) reads in liver were consistently
and positively correlated with %A^UN (Fig. 3B) but
not %A (fig. S13). Therefore, %A^UN is indicative of
transcriptional competency at rDNA.

The 45S-rRNA is cotranscriptionally cleaved at
position +650 within the 5′ external transcribed
spacer, and the first 650 nucleotides (nt) are then
rapidly degraded (12). We assessed the abundance
of the nascent, uncleaved 45S-rRNA precursor via
RT-qPCR targeting the first 650 nt. In the liver of
G1-C, 45S-rRNA abundance did not correlate with
CpG-133^A methylation, %A, or %A^UN (Fig. 3C and
fig. S14). In PR males, 45S-rRNA levels did not cor-
relate with CpG-133^A methylation or %A but cor-
related positively with %A^UN (Fig. 3C) (τ = 0.52,
P = 0.021) (fig. S14). Therefore, PR exposure induces
a correlation between transcriptional competency
and 45S-rRNA levels.

Because rDNA expression is sensitive to nu-
trient availability (13), the types of effects we
describe could be a conserved feature of other
nutritional developmental programming models.
We identified a recent study in which the authors
fed C57BL/6J G0 females a low-fat (LF) or high-fat
(HF) diet from 3 weeks before pregnancy until the
male G1 offspring were weaned at 3 weeks onto an
LF diet until they were killed at 9 weeks (14). Their
RRBS analysis of G1 liver did not identify any
maternal diet-induced DNA methylation differ-
ences. We mapped their raw sequencing reads to
rDNA and found that early-life exposure to HF
induces CpG-133^A hypermethylation in the G1s
(Fig. 4A) (P = 0.0098); again, CpG-133^C showed
lower methylation levels that were not affected
by diet. Unfortunately, there were insufficient mice
to examine correlations between %A and methyl-
ation or weaning weight. Next, we generated
bisPCR-seq data for G1 male C57BL/6J mice from
a model of maternal obesogenic diet (15) (elevated
fat and sugar content). G0 females were fed either
control or obesogenic diet 6 weeks before mating
until the G1 offspring were weaned at 3 weeks onto
a control diet and killed at 6 months. G1 males
exposed in utero to obesogenic diet showed hyper-
methylation at CpG-133^A (Fig. 4B) (P = 0.017).

Fig. 4. Maternal high-fat or obesogenic diet induces hypermethylation at CpG-133^A.
(A) RRBS raw sequencing reads [obtained from (14)] were mapped to the rDNA
consensus. G0 dams were fed either a low- (M^L) or high-fat (M^H) diet before
conception and up until the G1s were weaned. Data shown here are from the
livers of 9-week-old G1 males that were placed on a low-fat diet from weaning
up until they were killed. n = 10 mice per group. We only reanalyzed data from
(14) for the dietary groups analogous to the design of our PR model. (B) G0 dams
were fed either a control (C) or obesogenic (O) diet 6 weeks before conception
and up until the G1s were weaned. bisPCR-seq data shown here are from the
livers of 6-month-old G1 males that were placed on a control diet from weaning
until they were killed (CC, n = 7; OC, n = 8).

Recently, Shea et al. reported a study in which
they exposed male C57BL/6J mice to one of three
different diets (PR, HF, or caloric restriction) post-
weaning (16). They identified substantial inter-
individual genetic and methylomic variability at
rDNA but no consistent diet-induced effects. Al-
though part of the reason for the discrepant con-
clusions could be that they did not discriminate
between the A or C genetic variants, the more
likely explanation is differences in developmental
timing of the dietary insults because we analyzed
exposures spanning only the period between con-
ception and weaning. Previous human epidemio-
logical and animal studies suggest that early life is
a critical time when exposures can have long-term
phenotypic effects on the offspring (17).

In summary, we have described an example of
a mammalian “epiallele” whose epigenetic state
is influenced by an interaction between the under-
lying genotype and early-life environment, and
this correlates with transcriptional and pheno-
typic outcomes. A schematic model of the effects
we describe is presented in fig. S15. Our work, in
combination with previous demonstrations in flies
and yeast (18, 19), identifies rDNA as a genomic
target of various nutritional insults that is con-
served among nonmammalian and mammalian
models. Exploration of such interactions at rDNA
in humans could provide novel insights into the
molecular basis of some complex phenotypes
and diseases. 
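
The distinction drawn earlier between the relative abundance of the A-variant, the unmethylated share of A-variant copies, and the share of all rDNA copies that are both A-variant and unmethylated can be shown with a small worked example. This is a sketch under stated assumptions: the read counts are invented, and the function and key names are hypothetical labels for the quantities described in the text.

```python
def variant_summary(a_methylated, a_unmethylated, c_reads):
    """Summarize one amplicon from read counts.

    Returns percentages relative to the quantities described in the text:
      pct_a              A-variant reads / all reads
      pct_a_unmeth_of_a  unmethylated A-variant reads / A-variant reads
      pct_a_un           unmethylated A-variant reads / all reads
    """
    a_total = a_methylated + a_unmethylated
    total = a_total + c_reads
    if a_total == 0 or total == 0:
        raise ValueError("need at least one A-variant read")
    return {
        "pct_a": 100 * a_total / total,
        "pct_a_unmeth_of_a": 100 * a_unmethylated / a_total,
        "pct_a_un": 100 * a_unmethylated / total,
    }


# Two hypothetical mice with the same methylation level on A-variant copies
# but a different relative abundance of the A-variant.
mouse_1 = variant_summary(a_methylated=300, a_unmethylated=300, c_reads=400)
mouse_2 = variant_summary(a_methylated=150, a_unmethylated=150, c_reads=700)

for name, values in (("mouse 1", mouse_1), ("mouse 2", mouse_2)):
    print(name, {key: round(value, 1) for key, value in values.items()})
```

Both invented mice have half of their A-variant copies unmethylated, yet the share of all copies that are A-variant and unmethylated differs twofold, which is why the combined measure carries information that neither component provides alone.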

HUMAN GENETICS


Early Neolithic genomes from the 

We sequenced Early Neolithic genomes from the Zagros region of Iran (eastern Fertile
Crescent), where some of the earliest evidence for farming is found, and identify a
previously uncharacterized population that is neither ancestral to the first European
farmers nor has contributed substantially to the ancestry of modern Europeans. These
people are estimated to have separated from Early Neolithic farmers in Anatolia some
46,000 to 77,000 years ago and show affinities to modern-day Pakistani and Afghan
populations, but particularly to Iranian Zoroastrians. We conclude that multiple,
genetically differentiated hunter-gatherer populations adopted farming in southwestern
Asia, that components of pre-Neolithic population structure were preserved as
farming spread into neighboring regions, and that the Zagros region was the cradle of
eastward expansion.


The earliest evidence for cultivation and
stock-keeping is found in the Neolithic core
zone of the Fertile Crescent (1, 2), a region
stretching north from the southern Levant 
through eastern Anatolia and northern Mes- 
opotamia, then east into the Zagros Mountains on 
the border of modern-day Iran and Iraq (Fig. 1). 
From there, farming spread into surrounding 
regions, including Anatolia and, later, Europe, 
southern Asia, and parts of Arabia and North 
Africa. Whether the transition to agriculture was 
a homogeneous process across the core zone,
or a mosaic of localized domestications, is un- 
known. Likewise, the extent to which core zone 
farming populations were genetically homoge- 
neous, or exhibited structure that may have been 
preserved as agriculture spread into surrounding 
regions, is undetermined. 

Ancient DNA (aDNA) studies indicate that early 
Aegean farmers dating to ~6500 to 6000 BCE are 
the main ancestors of early European farmers 
(3, 4), although it is not known if they were pre- 
dominantly descended from core zone farming 
populations. We sequenced four Early Neolithic 
(EN) genomes from Zagros, Iran, including one 
to 10x mean coverage from a well-preserved male 
sample from the central Zagros site of Wezmeh 
Cave [WC1, 7455 to 7082 calibrated years (cal)
BCE]. The three other individuals were from 
Tepe Abdul Hosein and were less well preserved 
(genome coverage between 0.6 and 1.2x) but are 
around 10,000 years old, and therefore are among 
the earliest Neolithic human remains in the world 
(tables S1 and S3).

Despite a lack of a clear Neolithic context, 
the radiocarbon-inferred chronological age and 
palaeodietary data support WC1 being an early 
farmer (tables S1 to S3 and fig. S7). WC1 bone
collagen δ¹³C and δ¹⁵N values are indistinguishable
from those of a securely assigned Neolithic indi-
vidual from Abdul Hosein and consistent with a
diet rich in cultivated C3 cereals rather than ani-
mal protein. Specifically, collagen from WC1 and
Abdul Hosein is ¹³C depleted compared to those
from contemporaneous wild and domestic fauna
from this region (5), which consumed C4 plants.
Crucially, WC1 and the Abdul Hosein farmers 
exhibit very similar genomic signatures. 


The four EN Zagros genomes form a distinct 
cluster in the first two dimensions of a principal 
components analysis (PCA; Fig. 2); they plot 
closest to modern-day Pakistanis and Afghans 
and are well separated from European hunter- 
gatherers (HG) and other Neolithic farmers. In 
an outgroup f3-test (6, 7) (figs. S17 to S20), all 
four Neolithic Iranian individuals are genetically 
more similar to each other than to any other pre- 
historic genome except a Chalcolithic genome 
from northwestern Anatolia (see below). Despite 
¹⁴C dates spanning around 1200 years, these data
are consistent with all four genomes being sam- 
pled from a single eastern Fertile Crescent EN 
population. 
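
The outgroup f3 comparison used above can be sketched in a few lines. The statistic f3(O; A, B) averages (o - a)(o - b) over sites, where o, a, and b are allele frequencies in the outgroup and the two test samples, so larger values indicate more genetic drift shared by A and B since their split from the outgroup. This is a bare illustration only: published analyses add bias corrections and block-jackknife standard errors, and the allele frequencies below are invented.

```python
def f3(outgroup, pop_a, pop_b):
    """Uncorrected outgroup f3: mean of (o - a) * (o - b) across sites.

    Each argument is a list of allele frequencies at the same sites
    (0, 0.5, or 1 when a population is represented by one diploid genome).
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)) or not outgroup:
        raise ValueError("need equal-length, non-empty frequency lists")
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)


# Hypothetical allele frequencies at five sites.
outgroup = [0.0, 0.0, 1.0, 0.0, 1.0]
genome_1 = [1.0, 0.5, 0.0, 0.0, 1.0]   # first test genome
genome_2 = [1.0, 0.5, 0.0, 0.5, 1.0]   # shares most derived alleles with genome_1
genome_3 = [0.0, 0.5, 1.0, 1.0, 0.0]   # shares few derived alleles with genome_1

close_pair = f3(outgroup, genome_1, genome_2)
distant_pair = f3(outgroup, genome_1, genome_3)
print(f"f3(O; 1, 2) = {close_pair:.2f}")
print(f"f3(O; 1, 3) = {distant_pair:.2f}")
```

Ranking all candidate genomes by this value against a fixed test genome is what allows a statement such as "more similar to each other than to any other prehistoric genome".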

Examination of runs of homozygosity (ROH) 
above 500 kb in length in WC1 demonstrated 
that he shared a similar ROH distribution with 
European and Aegean Neolithics, as well as 
modern-day Europeans (Fig. 3, A and B). How- 
ever, of all ancient samples considered, WC1 
displays the lowest total length of short ROH, 
suggesting that he was descended from a rel- 
atively large HG population. In contrast, the ROH 
distributions of the HG Kotias from Georgia, and 
Loschbour from Luxembourg, indicate prolonged 
periods of small ancestral population size (8). 
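
The idea of scanning a genome for runs of homozygosity above a length threshold can be illustrated as follows. This is a deliberately simplified sketch, assuming a run is any stretch between consecutive heterozygous calls that spans at least the threshold. Dedicated tools additionally tolerate occasional genotyping errors and account for uncallable regions, and the positions used here are invented.

```python
def runs_of_homozygosity(het_positions, chrom_length, min_length=500_000):
    """Return (start, end) intervals free of heterozygous calls.

    het_positions : coordinates (bp) of heterozygous genotype calls
    chrom_length  : length of the chromosome (bp)
    min_length    : shortest interval to report (bp)
    """
    boundaries = [0] + sorted(het_positions) + [chrom_length]
    runs = []
    for left, right in zip(boundaries, boundaries[1:]):
        if right - left >= min_length:
            runs.append((left, right))
    return runs


def total_roh_length(runs):
    """Summed length of the reported runs (bp)."""
    return sum(end - start for start, end in runs)


# Hypothetical heterozygous sites on a 3.2-Mb stretch.
hets = [100_000, 150_000, 900_000, 950_000, 3_000_000]
runs = runs_of_homozygosity(hets, chrom_length=3_200_000)
print(runs)
print(total_roh_length(runs))
```

Binning the reported runs by length, short versus long, then gives the kind of distribution that is compared between individuals in the text.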

We also developed a method to estimate het-
erozygosity (θ) in 1-Mb windows that takes into
account postmortem damage and is unbiased 
even at low coverage (9) (Fig. 3, C and D). The 
mean θ in WC1 was higher than in HG indi-
viduals (Bichon and Kotias), similar to that in
Bronze Age individuals from Hungary and 
modern Europeans, and lower than in ancient 
(10) and modern Africans. Multidimensional 
scaling on a matrix of centered Spearman cor-
relations of local θ across the whole genome
again puts WC1 closer to modern populations 
than to ancient foragers, indicating that both 
the mean and distribution of diversity over 
the genome are more similar to those of mod- 
ern populations (Fig. 3E). However, WC1 does 
have an excess of long ROH segments (>1.6 Mb), 
relative to Aegean and European Neolithics (Fig. 
3B). This includes several very long (7 to 16 Mb) 
ROH segments (Fig. 3A), confirmed by low θ
estimates in those regions (Fig. 3C). These re- 
gions do not show reduced coverage in WC1 nor 
a reduction in diversity in other samples, with 
the exception of the longest such segment where 
we find reduced diversity in modern and HG 
individuals, although less extended than in WC1 
(7) (Fig. 3B). This observed excess of long seg- 
ments of reduced heterozygosity could be the 
result of cultural practices such as consanguinity 
and endogamy, or demographic constraints such 
as a recent or ongoing bottleneck (11). 
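
To make the idea of a per-window diversity estimate concrete, the sketch below computes the naive fraction of heterozygous calls in fixed 1-Mb windows. This is explicitly not the estimator described above: it ignores postmortem damage and coverage, which are the problems that method is designed to handle. The function name and input layout are assumptions for illustration.

```python
from collections import defaultdict


def windowed_heterozygosity(sites, window=1_000_000):
    """Naive heterozygosity per window: heterozygous calls / all calls.

    sites: iterable of (position_bp, is_heterozygous) for one chromosome
           (hypothetical input layout).
    Returns {window_index: fraction}, where window_index = position // window.
    """
    counts = defaultdict(lambda: [0, 0])  # window -> [het calls, all calls]
    for pos, is_het in sites:
        bucket = counts[pos // window]
        bucket[0] += int(is_het)
        bucket[1] += 1
    return {w: het / called for w, (het, called) in sorted(counts.items())}


# Hypothetical toy calls; not real data.
toy_calls = [(100, True), (200, False), (1_500_000, False), (1_600_000, False)]
theta_by_window = windowed_heterozygosity(toy_calls)
print(theta_by_window)
```

A window with a value near zero, such as the second window in the toy data, is the kind of region that would coincide with a long ROH segment.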

The extent of population genetic structure 
in Neolithic southwestern Asia has important 
implications for the origins of farming. High 
levels of structuring would be expected under 
a scenario of localized independent domesti- 
cation processes by distinct populations, whereas 
low structure would be more consistent with a 
single population origin of farming or a diffuse 
homogeneous domestication process, perhaps
involving high rates of gene flow across the entire
Neolithic core zone. The ancient Zagros individuals
show stronger affinities to Caucasus HGs
(table S17.1), whereas Neolithic Aegeans showed 
closer affinities to other European HGs (tables 
S17.2 and S17.3). Formal tests of admixture of
the form f3(Neo_Iranian, HG; Anatolia_Neolithic) 
were all positive with Z-scores above 15.78 (table 
S17.6), indicating that Neolithic northwestern
Anatolians did not descend from a population 
formed by the mixing of Zagros Neolithics and 
known HG groups. These results suggest that 
Neolithic populations from northwestern Anatolia 
and the Zagros descended from distinct ancestral 
populations. Furthermore, although the Caucasus 
HGs are genetically closest to EN Zagros individ- 
uals, they also share unique ancestry with east- 
ern, western, and Scandinavian European HGs 
(table S16.1), indicating that they are not the
direct ancestors of Zagros Neolithics. 
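
The logic of the admixture test above can be illustrated with the textbook definition of the statistic, f3(A, B; C) = E[(c − a)(c − b)] over SNPs, where a, b, and c are allele frequencies in the two candidate sources and the target. A clearly negative value is evidence that the target is a mixture of populations related to A and B; positive values, as reported here, give no such evidence. The sketch below is a naive version under stated assumptions: it omits the finite-sample bias correction and the block-jackknife Z-scores used in practice, and all frequencies are invented.

```python
from statistics import mean


def f3(target, source_a, source_b):
    """Naive f3(source_a, source_b; target) from per-SNP allele frequencies."""
    if not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("frequency lists must have equal length")
    return mean((c - a) * (c - b) for c, a, b in zip(target, source_a, source_b))


# Hypothetical allele frequencies at five SNPs (illustrative only).
iran = [0.10, 0.80, 0.30, 0.60, 0.20]
hg = [0.90, 0.20, 0.70, 0.10, 0.85]

# A target that has drifted away from both sources gives f3 > 0.
drifted_target = [0.95, 0.90, 0.80, 0.70, 0.95]

# A target that is an exact 50:50 mixture of the two sources gives f3 < 0.
mixed_target = [(a + b) / 2 for a, b in zip(iran, hg)]

print(f3(drifted_target, iran, hg))
print(f3(mixed_target, iran, hg))
```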

The significant differences between ancient 
Iranians, Anatolian/European farmers, and Euro- 
pean HGs suggest a pre-Neolithic separation. 
Assuming a mutation rate of 5 × 10⁻¹⁰ per site
per year (12), the inferred mean split time for 
Anatolian/European farmers (as represented by 
Bar8, 4) and European HGs (Loschbour) ranged 
from 33,000 to 39,000 years ago [combined 95% 
confidence interval (CI) 15,000 to 61,000 years 
ago], whereas the preceding divergence of the 
ancestors of Neolithic Iranians (WC1) occurred 
46,000 to 77,000 years ago (combined 95% CI 
38,000 to 104,000 years ago) (13) (fig. S48 and 
tables S34 and S35). Furthermore, the European
HGs were inferred to have an effective population
size (Ne) that was ~10 to 20% of either Neolithic
farming group, consistent with the ROH
and θ analyses.
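
Because the split times above are conditional on the assumed mutation rate, it can help to see how the dates scale with that assumption. The sketch below assumes only the generic relation between a mutation-scaled time τ (expected mutations per site) and calendar time, T = τ/μ; it is not the inference procedure used for the estimates in the text, and the alternative rate of 1 × 10⁻⁹ is a hypothetical value chosen for illustration.

```python
MU_PER_SITE_PER_YEAR = 5e-10  # mutation rate assumed in the text


def scaled_time_to_years(tau, mu=MU_PER_SITE_PER_YEAR):
    """Convert a mutation-scaled time (expected mutations per site) to years."""
    return tau / mu


def rescale_years(years, old_mu, new_mu):
    """Re-express a date obtained under old_mu as it would read under new_mu."""
    return years * old_mu / new_mu


# A 33,000-year split corresponds to this mutation-scaled time under 5e-10.
tau = 33_000 * MU_PER_SITE_PER_YEAR
print(scaled_time_to_years(tau))

# Under a hypothetical rate twice as fast, the same tau implies half the age.
print(rescale_years(33_000, 5e-10, 1e-9))
```

The inverse proportionality means that the relative ordering of the split times does not depend on the rate, only their absolute ages.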

Levels of inferred Neanderthal ancestry in 
WC1 are low (fig. S22 and table S21), but fall 
within the general trend described recently in 
Fu et al. (14). Fu et al. (14) also inferred a basal 
Eurasian ancestry component in the Caucasus 
HG sample Satsurblia when examined within the 
context of a “base model” for various ancient Eur- 
asian genomes dated from ~45,000 to 7,000 years 
ago. We examined this base model using ADMIX- 
TUREGRAPH (6) and inferred almost twice as 
much basal Eurasian ancestry for WC1 as for 
Satsurblia (62 versus 32%) (fig. S52), with the 
remaining ancestry derived from a population 
most similar to ancient north Eurasians such as 
Mal’ta1 (15). Thus, Neolithic Iranians appear to
derive predominantly from the earliest known 
Eurasian population branching event (7). 

“Chromosome painting” and an analysis of 
recent haplotype sharing using a Bayesian mix- 
ture model (7) revealed that, when compared to 
160 to 220 modern groups, WC1 shared a high 
proportion (>95%) of recent ancestry with indi- 
viduals from the Middle East, Caucasus, and 
India. We also compared WC1’s haplotype-sharing
profile to that of three high-coverage Neolithic 
genomes from northwestern Anatolia (Bar8; 
Barcin, Fig. 4), Germany (LBK; Stuttgart), and 
Hungary (NE1; Polgár-Ferenci-hát). Unlike WC1,
these Anatolian and European Neolithics shared 
~60 to 100% of recent ancestry with modern 
groups sampled from southern Europe (figs. S24, 
S30, and S32 to S37; table S22).

We also examined recent haplotype sharing 
between each modern group and ancient Neolithic
genomes from Iran (WC1) and Europe
(LBK, NE1), HG genomes sampled from Luxembourg
(Loschbour) and the Caucasus (KK1;
Kotias), a 4500-year-old genome from Ethiopia
(Mota), and Ust’-Ishim, a 45,000-year-old
genome from Siberia. Modern groups from
south, central, and northwestern Europe shared 
haplotypes predominantly with European Neolithic
samples LBK and NE1, and European
HGs, whereas modern Near and Middle Eastern, 
as well as southern Asian samples, had higher 
sharing with WC1 (figs. S28 and S29). Modern
Pakistani, Iranian, Armenian, Tajikistani, Uzbe- 
kistani, and Yemeni samples were inferred to 
share >10% of haplotypes with WC1. This was 
true even when modern groups from neighboring
geographic regions were added as potential
ancestry surrogates (figs. S26 and S27 and
table S23). Iranian Zoroastrians had the highest
inferred sharing with WC1 out of all modern 
groups (table S23). Consistent with this, out- 
group f3 statistics indicate that Iranian Zoro- 
astrians are the most genetically similar to all 
four Neolithic Iranians, followed by other mod- 
ern Iranians (Fars), Balochi (southeastern Iran, 
Pakistan, and Afghanistan), Brahui (Pakistan 
and Afghanistan), Kalash (Pakistan), and Geor- 
gians (figs. S12 to S15). Interestingly, WC1 most 
likely had brown eyes, relatively dark skin, and 
black hair, although Neolithic Iranians carried 
reduced pigmentation-associated alleles in sev- 
eral genes and derived alleles at 7 of the 12 loci 
showing the strongest signatures of selection 
in ancient Eurasians (3) (tables S29 to S33). 
Although there is a strong Neolithic compo- 
nent in these modern south Asian populations, 
simulation of allele sharing rejected full population 
continuity under plausible ancestral population
sizes, indicating some population turnover in
Iran since the Neolithic (7). 

While Early Neolithic samples from eastern 
and western southwest Asia differ conspicu- 
ously, comparisons to genomes from Chalcolithic 
Anatolia and Iron Age Iran indicate a degree of 
subsequent homogenization. Kumtepe6, a ~6750- 
year-old genome from northwestern Anatolia (16), 
was more similar to Neolithic Iranians than to 
any other non-Iranian ancient genome (figs. S17 
to S20 and table S18.1). Furthermore, our male
Iron Age genome (F38; 971 to 832 BCE; sequenced 
to 1.9x) from Tepe Hasanlu in northwestern Iran 
shares greatest similarity with Kumtepe6 (fig. S21) 
even when compared to Neolithic Iranians (table 
S20). We inferred additional non-Iranian or non-Anatolian
ancestry in F38 from sources such as
European Neolithics and even post-Neolithic 
Steppe populations (table S20). Consistent with 
this, F38 carried an N1a subclade mitochondrial
DNA (mtDNA), which is common in early Euro- 
pean and northwestern Anatolian farmers (3). In 
contrast, his Y chromosome belongs to subhaplogroup
R1b1a2a2, also found in five Yamnaya
individuals (17) and in two individuals from the
Poltavka culture (3). These patterns indicate that 
post-Neolithic homogenization in southwestern 
Asia involved substantial bidirectional gene flow 
between the east and west of the region, as well 
as possible gene flow from the Steppe. 

Migration of people associated with the Yam- 
naya culture has been implicated in the spread of 
Indo-European languages (17, 18), and some level 
of Near Eastern ancestry was previously inferred 
in southern Russian pre-Yamnaya populations 
(3). However, our analyses suggest that Neolithic 
Iranians were unlikely to be the main source of 
Near Eastern ancestry in the Steppe population
(table S20) and that this ancestry in pre-Yamnaya
populations originated primarily in the west of
southwest Asia.

Fig. 4. Modern-day peoples with affinity to WC1. Modern groups with an increasingly higher (respectively lower) inferred proportion of haplotype sharing with
the Iranian Neolithic Wezmeh Cave (WC1, 7455 to 7082 cal BCE, blue triangle) compared to the Anatolian Neolithic Barcin genome (Bar8; 6212 to 6030 cal BCE,
red triangle) are depicted with an increasingly stronger blue or red color, respectively. Circle sizes illustrate the relative absolute proportion of this difference
between WC1 versus Bar8. The key for the modern group labels is provided in table S24.

We also inferred shared ancestry between 
Steppe and Hasanlu Iron Age genomes that was 
distinct from EN Iranians (table S20) (7). In 
addition, modern Middle Easterners and South 
Asians appear to possess mixed ancestry from 
ancient Iranian and Steppe populations (tables 
S19 and S20). However, Steppe-related ancestry 
may also have been acquired indirectly from 
other sources (7), and it is not clear if this is 
sufficient to explain the spread of Indo-European 
languages from a hypothesized Steppe homeland 
to the region where Indo-Iranian languages are 
spoken today. Yet the affinities of Zagros Neolithic 
individuals to modern populations of Pakistan, 
Afghanistan, Iran, and India are consistent with
a spread of Indo-Iranian languages, or of Dravidian
languages (which include Brahui), from the
Zagros into southern Asia, in association with 
farming (19). 

The Neolithic transition in southwest Asia 
involved the appearance of different domestic
species, particularly crops, in different parts of
the Neolithic core zone, with no single center 
(20). Early evidence of plant cultivation and goat 
management between the 10th and the 8th millennium
BCE highlights the Zagros as a key region
in the Neolithization process (1). Given the
evidence of domestic species movement from 
east to west across southwest Asia (21), it is surprising
that EN human genomes from the Zagros
are not closely related to those from northwest- 
ern Anatolia and Europe. Instead they represent 
a previously undescribed Neolithic population. 
Our data show that the chain of Neolithic mi- 
gration into Europe does not reach back to the 
eastern Fertile Crescent, also raising questions 
about whether intermediate populations in south- 
eastern and Central Anatolia form part of this 
expansion. Nevertheless, it seems probable that 
the Zagros region was the source of an eastern 
expansion of the southwestern Asian domestic 
plant and animal economy. Our inferred persistence
of ancient Zagros genetic components in
modern-day south Asians lends weight to a strong
demic component to this expansion.

Crystal structure of Zika virus
NS2B-NS3 protease in complex 
with a boronate inhibitor 


Jian Lei, Guido Hansen, Christoph Nitsche, Christian D. Klein,

The ongoing Zika virus (ZIKV) outbreak is linked to severe neurological disorders. ZIKV
relies on its NS2B/NS3 protease for polyprotein processing; hence, this enzyme is an 
attractive drug target. The 2.7 angstrom crystal structure of ZIKV protease in complex with 
a peptidomimetic boronic acid inhibitor reveals a cyclic diester between the boronic acid 
and glycerol. The P2 4-aminomethylphenylalanine moiety of the inhibitor forms a 
salt-bridge with the nonconserved Asp83 of NS2B; ion-pairing between Asp83 and the P2
residue of the substrate likely accounts for the enzyme’s high catalytic efficiency. The 
unusual dimer of the ZIKV protease:inhibitor complex seen in the crystal may provide a 
model for assemblies formed at high local concentrations of protease at the 
endoplasmic reticulum membrane, the site of polyprotein processing.


Previously considered a rare and mild pathogen
for humans (1), Zika virus (ZIKV)
infection has recently been found to be 
responsible for neurological disorders in 
a substantial portion of patients. The in- 
fection can trigger Guillain-Barré syndrome 
(2), and prenatal ZIKV infection is responsible 
for a dramatically increased number of micro- 
cephaly cases in fetuses and newborn children 
(3). The World Health Organization (WHO) recently
declared the association of ZIKV infection
with these neurological disorders a Public
Health Emergency of International Concern (4). 
There are no vaccines or antiviral drugs avail- 
able for protection from or treatment of ZIKV 
infection. 

ZIKV is a member of the genus Flavivirus 
in the Flaviviridae family of RNA viruses. Its 
~10.7-kb single-stranded RNA genome of posi- 
tive polarity encodes a single polyprotein, which, 
by analogy to other flaviviruses, is assumed to 
be cleaved by host-cell proteases (signalase and 
furin) and the viral NS2B/NS3 protease into three 
structural (C, prM/M, and E) and seven non- 
structural proteins (NS1, NS2A, NS2B, NS3, NS4A, 
NS4B, and NS5) (fig. S1). Similar to other flavivirus
proteases, such as those of dengue virus
(DENV) and West Nile virus (WNV), the mature 
form of ZIKV protease consists of the N-terminal 
domain of NS3, which carries the catalytic triad 
Ser135-His51-Asp75, and the membrane-bound
NS2B (a sequence alignment is available in fig.
S2). Crystallization of this complex has not been
successful so far for any flavivirus protease, but it
has been shown that a construct comprising ~40
hydrophilic residues of NS2B and ~185 residues
of NS3, covalently linked via a Gly4-Ser-Gly4 sequence,
displays strong peptidolytic activity (5).
Crystal structures of the free form of this protease
construct (“NS2B-NS3pro”) usually reveal
an “open conformation” featuring a well-ordered
NS3pro core and a flexible NS2B part that shows
only limited interaction with NS3pro, whereas inhibitor
(and presumably substrate) binding induces
a pronounced conformational change of NS2B
yielding a more compact, “closed” form (6, 7).

We expressed in Escherichia coli a DNA construct
corresponding to the NS2B-NS3pro-coding
region of the Brazilian ZIKV isolate BeH823339
(GenBank accession number KU729217.2) (8).
This construct codes for residues 49 to 95 of
ZIKV NS2B, the C terminus of which is covalently
linked via Gly4-Ser-Gly4 to the N terminus of
NS3 (residues 1 to 170). The recombinant enzyme
obtained is a mixture of monomer, disulfide-linked
dimer (here designated “SS-dimer”), and,
to a lesser extent, higher oligomers (fig. S3). The
double mutant Cys80Ser/Cys143Ser leads to loss of
the disulfide bond, which occurs between Cys143
residues of different polypeptide chains, as revealed
by our x-ray structure. The SS-dimer and
the monomer obtained by reduction with tris(2-carboxyethyl)phosphine
(TCEP) (fig. S3) as well
as the Cys80Ser/Cys143Ser mutant of ZIKV NS2B-NS3pro
are hyperactive against the standard
flavivirus protease substrate benzoyl-norleucine-lysine-lysine-arginine
7-amino-4-methylcoumarin
(Bz-Nle-Lys-Lys-Arg-AMC), with a very low Michaelis
constant (Km) and a specific catalytic efficiency
(kcat/Km) more than 20 times higher than for the
WNV enzyme (Table 1).
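
The comparison of catalytic efficiencies can be checked with a few lines of arithmetic. The sketch below assumes simple Michaelis-Menten kinetics and reads the WNV entry of Table 1 as kcat = 8.7 s⁻¹ and Km = 77.4 μM, and the lower of the two legible ZIKV kcat/Km values as 2,440,000 s⁻¹ M⁻¹; these readings of the extracted table, and the function names, are assumptions made for illustration.

```python
def catalytic_efficiency(kcat_per_s, km_molar):
    """kcat/Km in s^-1 M^-1."""
    return kcat_per_s / km_molar


def michaelis_menten_rate(kcat_per_s, km_molar, enzyme_molar, substrate_molar):
    """Initial rate v = kcat * [E] * [S] / (Km + [S]) in M s^-1."""
    return kcat_per_s * enzyme_molar * substrate_molar / (km_molar + substrate_molar)


# WNV NS2B-NS3pro values as read from Table 1 (assumed reading, see above).
wnv_kcat = 8.7        # s^-1
wnv_km = 77.4e-6      # M
wnv_efficiency = catalytic_efficiency(wnv_kcat, wnv_km)

# Lower of the two ZIKV kcat/Km values legible in Table 1 (assumed reading).
zikv_efficiency = 2_440_000  # s^-1 M^-1
fold_difference = zikv_efficiency / wnv_efficiency

print(round(wnv_efficiency), round(fold_difference, 1))

# At [S] = Km the rate is half of Vmax = kcat * [E] (1 nM enzyme is hypothetical).
half_max_rate = michaelis_menten_rate(wnv_kcat, wnv_km, 1e-9, wnv_km)
print(half_max_rate)
```

Under these readings the computed WNV efficiency reproduces the tabulated value of about 112,000 s⁻¹ M⁻¹, and the ratio is consistent with the statement that the ZIKV enzyme is more than 20 times more efficient.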

Fig. 1. Crystal structure of the ZIKV NS2B-NS3pro
monomer in complex with cn-716. (A)
Overall structure of the complex. NS3pro (light
blue) and NS2B (purple) are shown as ribbons,
with secondary-structure elements labeled. NS3pro
is made up of two β-barrels with strand orders
AI-BI-CI-αI-DI-EIa-EIb-FI and AII-BIIa-BIIb-CII-DII-EIIa-EIIb-FII.
NS2B includes β-strands β1 to β4. The
N- and C-termini of NS2B and NS3pro are indicated
by letters in italics and nonitalics/underlined, respectively.
The inhibitor cn-716 is shown with carbon
atoms in purple and boron in yellow. Residues of
the catalytic triad are in dark blue. Asterisk denotes
residues from NS2B. (B) The inhibitor cn-716 is
embedded in the substrate-binding site of ZIKV
NS2B-NS3pro [same view as in (A)]. The surfaces
of NS2B and NS3pro are yellow and purple, respectively.
An Fobs-Fcalc difference density contoured
at 2.5σ is shown for cn-716. Lys54 from molecule B
of the dimer interacts with the inhibitor and is indicated
by underlined K54. (C) Chemical structure of
cn-716. (D) Schematic drawing and (E) Fobs-Fcalc
difference density (2.5σ) for the cyclic diester and
its environment in molecule A. (F) Difference density
(2.5σ) for the cyclic diester and its environment
in molecule B.

In order to elucidate the molecular basis of
this hyperactivity, and to provide a starting point
for structure-based drug design efforts, we have
crystallized ZIKV NS2B-NS3pro in the closed
form and determined its x-ray structure at 2.7 Å
resolution. Containing two molecules (“A” and “B”)
per asymmetric unit of the crystal, the structure
reveals the same chymotrypsin-like fold for the
NS3pro domain as seen previously for other
flavivirus proteases, with the NS2B polypeptide
wrapped around the NS3pro. The interaction
between the two is stabilized by hydrogen bonds 
between β-strands β1 and AI, β2 and BIIa, as well
as β3 and BIIb of NS2B and NS3pro, respectively
(Fig. 1A). The root mean square deviations between
the ZIKV NS2B-NS3pro complex and tetrapeptide
aldehyde complexes of WNV and DENV-3
proteases are 0.9 to 1.1 Å [for main-chain atoms;
Protein Data Bank (PDB) codes 2FP7 and 3U1I
(6, 9)]. The capped dipeptide boronic acid com- 
pound cn-716 (Fig. 1C) was used to obtain the 
closed conformation of the protease. We found 
this compound to reversibly inhibit ZIKV NS2B-NS3pro
with half maximal inhibitory concentration
(IC50) = 0.25 ± 0.02 μM and inhibition constant
(Ki) = 0.040 ± 0.006 μM (in the presence of 20%
glycerol) (fig. S4). In the structure of the complex,
the boron atom is covalently linked to the
side-chain Oγ of the catalytic Ser135 (Fig. 1, B and
D to F). The structure also reveals that the 
boronic acid moiety forms a cyclic diester with 
glycerol, which was continuously present in our 
enzyme preparation during purification and crys- 
tallization, as well as cryoprotection of crystals. 
Boronic acids tend to form esters with diols and 
triols, especially if five- or six-membered rings 
can be formed (10). Our Fobs-Fcalc (observed and
calculated structure-factor amplitudes, respectively) 
difference density indicates that a six-membered 
ring has been formed by reaction of the boronic 
acid with the terminal hydroxyl groups of glycerol 
(Fig. 1, B and D to F). The six-membered ring fits
neatly into the S1′ pocket of the enzyme (Fig. 1B),
a site so far rarely addressed by synthetic flavivirus
protease inhibitors. In the absence of glycerol,
the IC50 for the boronic acid inhibitor was
nearly unchanged (0.20 ± 0.02 μM), but through
ester formation with larger, more hydrophobic 
diols or triols, a prodrug might be obtained that 
will traverse the cellular membrane more readily 
than will free boronic acid derivatives. 
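
The gap between the reported IC50 and Ki can be rationalized with the Cheng-Prusoff relation, Ki = IC50 / (1 + [S]/Km), which holds for simple competitive inhibition. The text does not state how Ki was obtained from the assay, so the sketch below is an illustration under that assumption rather than a description of the authors' analysis; the function names and the example Km are hypothetical.

```python
def ki_from_ic50(ic50, substrate_conc, km):
    """Cheng-Prusoff relation for a competitive inhibitor (same units for ic50 and Ki)."""
    return ic50 / (1.0 + substrate_conc / km)


def implied_substrate_to_km_ratio(ic50, ki):
    """Invert Cheng-Prusoff: the [S]/Km ratio that reconciles a given IC50 and Ki."""
    return ic50 / ki - 1.0


# Values reported in the text, in micromolar (20% glycerol condition).
ic50_um = 0.25
ki_um = 0.040

ratio = implied_substrate_to_km_ratio(ic50_um, ki_um)
print(ratio)

# Round trip with a hypothetical Km of 10 uM: recovers the reported Ki.
km_um = 10.0  # hypothetical, for illustration only
print(ki_from_ic50(ic50_um, ratio * km_um, km_um))
```

If the competitive model applies, the two reported values would correspond to an assay run at a substrate concentration several-fold above Km.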

Because of the ring closure, the tetrahedral 
geometry of the boron is somewhat distorted. 
In molecule A, the six-membered ring assumes a 
boat-like conformation, with the middle hydroxyl 
group (O2) of glycerol in an axial position and
donating an intramolecular hydrogen bond to the 
main-chain oxygen of the P2 residue of the in- 
hibitor (Fig. 1E). In molecule B, the six-membered 
ring adopts a somewhat twisted half-chair con- 
formation, and the central hydroxyl group, also 
in an axial position, donates a hydrogen bond to 
the carbonyl oxygen of Val®° (Fig. 1F). In mole- 
cule A, the two ring oxygens (O1 and O3) accept 
H-bonds respectively from the amide of Gly133
(from the oxyanion hole) and from the catalytic
His51, whereas the latter interaction is missing in
molecule B. These differences between the two 
inhibitor molecules in the asymmetric unit of the 
crystal reflect the conformational variability of 
the cyclic boronates (10). 

The P1-Arg residue of cn-716 forms a salt-bridge
with Asp129, a feature conserved in many flavivirus
protease complexes. Most probably protonated, the
amino group of the 4-aminomethylphenylalanyl
residue in the P2 position forms a hydrogen
bond with the main-chain oxygen of Ser81* and a
salt-bridge with Asp83* of the NS2B polypeptide
(Fig. 1B; residues of NS2B are denoted by an
asterisk). Asp83* is Asn in WNV and Ser or Thr in
DENV 1-4 NS2B-NS3pro (fig. S2), unable to form
an ion-pair interaction with the P2 residue of the
inhibitor or the substrate, Bz-Nle-Lys-Lys-Arg-AMC.
The Asp83*Asn mutation leads to an approximately
twofold increase of Km and a kcat/Km
reduced by 55%, as compared with the wild-type
(WT) enzyme (Table 1). The Asp residue in this
position provides an at least partial explanation for
the lower Km and hence the much higher kcat/Km
of ZIKV protease as compared with the WNV
and DENV enzymes (Table 1). DENV NS2B/NS3
protease has been shown to counteract the type-I
interferon response via digesting the stimulator
of interferon genes (STING) in human dendritic cells
(DCs) (11). Because ZIKV also permissively infects
human DCs (12), we speculate that an increased
catalytic activity of ZIKV NS2B/NS3pro could cause
more efficient cleavage of STING, leading to an
enhanced suppression of the host innate immunity.

In the crystal, ZIKV NS2B-NS3pro forms an
unusual dimer with noncrystallographic, quasi- 
twofold symmetry (Fig. 2A) that has not been 
seen with other flavivirus proteases. This tight 
dimer has to be distinguished from the labile SS- 
dimers seen in solution. In the tight dimer, the 
substrate-binding sites of the two monomers,

Fig. 2. The tight, noncrystallographic 2:2 dimer of quasi-twofold symmetry formed by the ZIKV
NS2B-NS3pro:inhibitor complex in the asymmetric unit of the crystal. (A) Front view and back view.
The surfaces of NS2B and NS3pro of molecule A are shown in light blue and orange, respectively; those of
molecule B are dark blue and beige. Labels of molecule B residues are underlined. Residues of NS2B are
marked by an asterisk. Residues Leu30 (purple/green) and Leu31 at the tip of the AI-BI loop (Fig. 1A) form a
hook, making hydrophobic contacts with the opposing monomer. The Cys143 residues forming labile
disulfide bonds in the SS-dimer are yellow and pink. (B) A slice through the interior of the dimer, showing
the S135 side-chains (dark blue) covalently bound to the inhibitor molecules. The color code is the same as
in (A). Inhibitor molecules are colored purple and red. A schematic illustration of the interactions across
the dimer interface is provided in fig. S5.


Table 1. Kinetic parameters of variants of ZIKV NS2B-NS3 protease, in comparison to a similar
WNV NS2B-NS3pro construct. Data are for the cleavage of the flavivirus protease substrate Bz-Nle-Lys-Lys-Arg-AMC.
“Monomer (wt)” (details, including definition of “wt”, are provided in the supplementary
materials), and “SS-dimer (wt)” indicate enzyme preparations corresponding to the monomer (in the
presence of TCEP) and the SS-dimer fraction from gel permeation chromatography. The kinetic parameters
for the ZIKV protease with Asp83* replaced by Asn are also included. “WNV NS2B-NS3pro (wt)” is our
recombinant preparation of the WNV protease. For comparison, the kcat/Km values for WNV and DENV-2
NS2B-NS3pro with the substrate Bz-Nle-Lys-Arg-Arg-AMC reported in (6) are given. Dashes indicate
“not reported.” All values in this table are obtained at pH 8.5.


Protease                 kcat (s⁻¹)     Km (μM)        kcat/Km (s⁻¹ M⁻¹)

ZIKV NS2B-NS3pro
                         44.6 ± 1.0                    2,440,000 ± 215,000
                         28.8 ± 0.5     5.1 ± 0.5      5,620,000 ± 546,000

WNV NS2B-NS3pro (wt)     8.7 ± 0.1      77.4 ± 3.6     112,000 ± 5000
                                                       30,000 ± 7000


along with the bound inhibitor, face each other
(Fig. 2B). The dimer has openings at both sides,
which upon some “breathing” would allow access
of substrate to the two active sites located at
the center (Fig. 2A). The tight dimer in the asymmetric
unit of the crystal is connected to neighboring
dimers through two labile disulfide bonds
linking Cys143 of monomer A to the same residue
of monomer B in a symmetry-related dimer, and
vice versa, giving rise to disulfide-mediated polymers
of tight dimers (Cys143 is indicated in Fig.
2A). This disulfide bond is responsible for the
formation of the “SS-dimer” apparent in the
SDS-polyacrylamide gel electrophoresis (fig. S3).
After a few seconds of the crystal in the x-ray
beam, this exposed disulfide appears to be reduced
as a consequence of irradiation, although
the presence of the disulfide seems to be essential
for crystallization of the ZIKV NS2B-NS3pro,
as we failed to obtain crystals of the Cys80Ser/Cys143Ser
variant.

Formation of the tight dimer in the asymmetric
unit buries ~1240 Å² of the surface of each
of the two monomers, and the shape comple- 
mentarity (Sc) index (13) is 0.64 [for a large set 
of well-characterized homodimeric proteins, the 
mean Sc was 0.69 + 0.07 (14)]. If we include the 
two inhibitor molecules in the calculation, ~1500 Å²
of molecular surface are buried per monomer. 
Both the large surface area buried and the shape 
complementarity indicate that dimer formation 
is likely of biological relevance. Although we failed 
to observe this dimer in solutions of the ZIKV 
NS2B-NS3pro complex with cn-716 up to a concentration
of 133 μM, we detected it by means
of electrospray ionization mass spectrometry in 
the presence but not in the absence of the boronic 
acid inhibitor (fig. S6). The structure suggests that 
the closed form of the enzyme has the potential 
of forming well-defined dimers at higher concen- 
trations as they may occur (and are perhaps 
promoted by the membrane-embedded parts of 
NS2B, which are lacking in the present structure) 
at the endoplasmic reticulum membrane, where 
polyprotein processing and viral replication take 
place. 

Peptide boronic acids have previously been 
tested as drugs, and the proteasome inhibitor 
bortezomib (Velcade) has been approved for the 
treatment of multiple myelomas (15). A tetrapeptide- 
boronic acid was reported as a potent inhibitor 
of the DENV-2 NS2B-NS3pro but not studied
further (16). Peptide boronic acids are usually not 
cytotoxic to Huh7 cells, which is what we ob- 
served with compound cn-716 (fig. S7). The struc- 
ture presented here forms a good starting point 
for the design of more specific anti-ZIKV drugs.

STRUCTURAL BIOLOGY


An atomic model of HIV-1 capsid-SP1 
reveals structures regulating 
assembly and maturation 


Florian K. M. Schur, Martin Obr, Wim J. H. Hagen, William Wan,
Arjen J. Jakobi, Joanna M. Kirkpatrick, Carsten Sachse,
Hans-Georg Kräusslich, John A. G. Briggs


Immature HIV-1 assembles at and buds from the plasma membrane before proteolytic 
cleavage of the viral Gag polyprotein induces structural maturation. Maturation can be 
blocked by maturation inhibitors (MIs), thereby abolishing infectivity. The CA (capsid) and 
SP1 (spacer peptide 1) region of Gag is the key regulator of assembly and maturation and is 
the target of MIs. We applied optimized cryo-electron tomography and subtomogram
averaging to resolve this region within assembled immature HIV-1 particles at 3.9 angstrom 
resolution and built an atomic model. The structure reveals a network of intra- and 
intermolecular interactions mediating immature HIV-1 assembly. The proteolytic cleavage 
site between CA and SP1 is inaccessible to protease. We suggest that MIs prevent CA-SP1
cleavage by stabilizing the structure, and MI resistance develops by destabilizing CA-SP1. 


The major structural protein of HIV-1, Gag,
oligomerizes at the plasma membrane of
infected cells, leading to membrane bending
and release of immature virus particles.
Ordered cleavage of the Gag polyprotein at
five sites by the viral protease (PR) then causes a
dramatic structural rearrangement of the virus
to produce the mature, infectious virion (fig. S1,
A to C). Gag-Gag interactions in the immature
virus are mediated by the adjacent CA (capsid)
domain and SP1 (spacer peptide 1). After maturation,
CA forms a conical core encapsulating
the viral RNA-nucleoprotein complex (1). The
final proteolytic cleavage in Gag occurs between
CA and SP1. Even small remnants of uncleaved
CA-SP1 have a dominant-negative effect on infectivity
(2, 3). Cleavage at this site is inhibited
by maturation inhibitors (MIs) (fig. S1B) (4), and
polymorphisms in this region cause resistance
against the first-in-class MI bevirimat (BVM)
(5). Other MIs inhibiting CA-SP1 cleavage are
currently undergoing clinical trials, but the precise
mechanisms of their inhibitory action and
of resistance against them are unclear.

Unprocessed Gag is assembled into irregular
curved lattices whose structures cannot be determined
using conventional structural biology
techniques. The best structural model for CA in
the immature virus has been derived by fitting
nuclear magnetic resonance and crystal structures
of the folded N-terminal and C-terminal
domains (CA-NTD and CA-CTD, respectively) into
a structure of the immature CA lattice acquired
through cryo-electron tomography and subtomogram
averaging (6). Subtomogram averaging
can resolve protein structures within complex
environments ranging from cells to enveloped 
viruses (7), but it has been limited to resolutions 
of ~8 Å. The positions of α-helices are visible at
this resolution, but those of amino acids are 
not. Furthermore, 8 Å resolution is not sufficient
to generate ab initio structural models for
unknown structures such as the Gag regions 
downstream of the crystallized CA-CTD, which 
ends at residue 220 of the 231-amino acid CA 
domain (fig. S1D). This downstream region con- 
sists of a sequence of amino acids that is 
thought to assemble into a flexible hinge, fol- 
lowed by a sequence that is predicted to form a 
six-helix bundle spanning the C-terminal resi- 
dues of CA, the critical CA-SP1 cleavage site, 
and the N-terminal residues of SP1 (8). This 
region includes the majority of the amino acids 
that are essential for virus assembly (9-11). Ob- 
taining a high-resolution structure of CA-SP1 in 
the immature arrangement is vital for a mecha- 
nistic understanding of HIV-1 assembly, matu- 
ration, and inhibition. 

Fig. 1. Structure of the immature HIV-1 CA-SP1 lattice at 3.9 Å. (A) Computational slices through
tomograms of immature HIV-1 (D25A mutant), untreated ΔMACANCSP2 VLPs, and BVM-treated
ΔMACANCSP2 VLPs. The arrowhead marks the membrane bilayer in the left panel. Scale bar, 50 nm.
(B) Electron densities of CA-SP1 from the samples shown in (A), viewed perpendicular to the lattice,
generated by subtomogram averaging. One CA-SP1 monomer is highlighted in color, with the CA-NTD
in cyan and the CA-CTD and SP1 in orange. The resolutions of the determined structures are
noted. (C) The refined atomic model, viewed from outside of the virus (top) and rotated by 90°,
shown in the same view as in (B) (bottom). The sixfold symmetry axis is indicated with a hexagon.

The HIV-1 Gag construct ΔMACANCSP2 (MA,
matrix; NC, nucleocapsid) (11) assembles into
immature virus-like particles (VLPs) in vitro (Fig.
1A and fig. S1C). We assembled ΔMACANCSP2
particles in the absence or presence of 100 μg/ml
BVM. We also purified intact, immature HIV-1
virus particles carrying an inactivating PR mutation,
D25A. For all three cases, we used an
optimized data collection scheme to acquire
cryo-electron tomography tilt series (table S1)
(12, 13). VLPs with or without BVM were indistinguishable,
showing densities for the Gag lattice
that were similar to those in the immature virus 
particles (Fig. 1A). Subtomogram averaging was 
performed independently for each data set, using 
an optimized workflow with features including 
frame-based motion correction (J4) and exposure 
filtering (15). The resolutions of the CA-SP1 layer 
in the final averages for untreated VLPs, BVM- 
treated VLPs, and immature virus particles were 
4.5, 3.9, and 4.2 A, respectively (Fig. 1B and fig. 
S2, A and B). 

We compared the three structures and found 
no clear differences in the protein densities of 
CA or SP1; they varied only in the presence or 
absence of densities at the center of the hexamer
near SP1 (fig. S2, C to I, and fig. S3C). Given
the high degree of structural similarity between
the maps, we used the 3.9 Å structure to build
and refine a complete atomic model for the 
CA-SP1 region from Gag residues 148 to 371 
(Figs. 1C and 2, figs. S3 and S4, table S2, and 
movies S1 and S2). Residues 356 to 371, covering
the C terminus of CA and the first eight
residues of SP1, assemble into a six-helix bundle. 


Interactions in the CA-NTD layer are described 
in fig. S4A. The CA-NTD and CA-CTD contact 
one another in the region of E160 and E161 in 
the CA-NTD and Q308 in the CA-CTD. The rela- 
tive positions of the two domains may also be 
restricted by the extended rigid linker between 
them, which appears to be stabilized by an in- 
teraction with R305 in helix 8. Y277, the last resi- 
due of helix 7 in a solution structure (16) and in 
the mature-like CA hexamer (17), is rotated out of 
the helix and packs against the linker at P279 
(Fig. 2B). This network of interactions may allow 
the CA-CTD to be structurally modulated by cleav- 
age upstream of the CA-NTD, and vice versa. 

Highly conserved residues of the major ho- 
mology region (MHR) in the CA-CTD abut the 
extended chain between the 3₁₀-helix and helix 8
(Fig. 2C). Charged residues in the 3₁₀-helix and
extended chain interact with the neighboring 
CA-CTD molecule within the hexamer (Fig. 2, B 
and C). Hexamers are linked by a CA-CTD dimer 
interface (fig. S4B) formed by residues W316 and 
M317 (6, 18, 19). 

Downstream of helix 11, beyond the residues 
that are resolved in crystal structures (19, 20),
the flexible hinge region (referred to here as
the VGG hinge; residues 353 to 355) unexpectedly 
adopts a rigid structure within the lattice. P356 
then marks the start of a helix (referred to here 
as the CA-SP1 helix) that extends down to residue 
371 in SP1 (Fig. 2, A and D) before abruptly 
ending, indicating that residues C-terminal of 
371 are disordered (movie S1). The CA-SP1 
helix protrudes up into the CA-CTD layer, 
where the top of the helix and the VGG hinge 
from one Gag molecule pack tightly against 
the CA-CTD from the neighboring Gag mole- 
cule. The CA-CTD, VGG hinge, and CA-SP1 helix 
thus form a single, integrated assembly unit 
that defines the structure of the hexamer (Fig. 2, 
D and G). 

This assembly unit appears to be stabilized
by a three-way interaction between H358 in the
CA-SP1 helix, D329 in the base of helix 10, and
P356 in the CA-SP1 helix of the neighboring CA
molecule (Fig. 2E). K290 in the loop upstream
of helix 8 and K359 in the CA-SP1 helix protrude
from above and below these residues toward
the center of the six-helix bundle, where they
coordinate a density, presumably a negatively
charged ion cluster (Fig. 2F and movie S2).
This arrangement is reminiscent of the six arginine
residues that protrude into the center
of the NTD hexamer in mature HIV CA (7, 21).
Twelve essential amino acid residues have been identified
in the CA-CTD, where mutation to alanine
abolishes virus assembly (22, 23). These include
W316 and M317 in the hydrophobic dimeric interface;
V353, G354, and G355 in the VGG hinge;
K290, D329, P356, H358, and K359, which together
form the intricate network of interactions
that defines the assembly unit; and A360
and L363, which appear to have hydrophobic
interactions within the six-helix bundle. There
is therefore a close correlation between the sensitivity
of a residue to mutation and whether it
has a role in mediating interactions in the CA-SP1
assembly unit, confirming the importance
of these interactions in virus assembly.

Fig. 2. Structural features in the immature CA-SP1 lattice. (A) A single
CA-SP1 monomer, as in Fig. 1C. α-helices (H1 to H11 and H_CA-SP1) and the cyclophilin
A binding loop (CypA BL) are labeled (α-helices from neighboring CA
monomers are shown in brackets). Colored rectangles indicate regions enlarged
in (B) to (D). (B) The CA-NTD-CA-CTD linker is in an extended conformation,
with Y277 binding to the linker and S278 approaching R305. (C) The highly
conserved residues in the MHR (Q287, E291, and R299) stabilize the linker
connecting the 3₁₀-helix and helix 8. Residues in this linker can interact with an
adjacent CA monomer around the hexameric ring (e.g., R286 with E344 and
D284 with R294); point mutations of these residues do not abolish assembly
(22), suggesting some redundancy in these interactions. (D) The CA-CTD, the
VGG hinge, and the top of the CA-SP1 helix form an integrated structural assembly
unit. The CA-SP1 cleavage site is marked by a blue star. Dashed rectangles
indicate the approximate positions of the regions enlarged in (E) and
(F). Residues are colored as in (E) and (F). (E) Residues D329, P356, and H358
(in purple) form a three-way linkage between two neighboring CA-SP1 helices
and the base of the CA-CTD. (F) K290 and K359 (in green) protrude from
above and below the region shown in (E) to the center of the hexamer, where
they coordinate a strong density. (G) Horizontal (left) and vertical (right)
slabs through the structure illustrate that the MHR (yellow), other residues in
the CA-CTD base (red), the VGG hinge (blue), and the top of the CA-SP1 helix
(pink) come together to form the hexameric assembly unit. In the vertical
slab, one-half of the hexamer is represented in a surface view. Single-letter
abbreviations for the amino acid residues are as follows: A, Ala; D, Asp; E, Glu;
G, Gly; H, His [except when used to indicate helices, as in (A) to (D)]; K, Lys; L,
Leu; M, Met; P, Pro; Q, Gln; R, Arg; S, Ser; V, Val; W, Trp; and Y, Tyr.

Fig. 3. Mutations that confer resistance to
MIs destabilize the immature lattice. Resistance
mutations or naturally occurring polymorphisms
that render HIV-1 resistant to BVM (red)
or PF-46396 (green) have been mapped onto the
atomic model. Mutations that confer resistance
to both compounds are colored yellow. The CA-SP1
cleavage site is marked by blue stars. For
clarity, only three helices of the six-helix bundle
are displayed.

During maturation, the final proteolytic cleav- 
age occurs between L363 and A364. In our 
structure, this site is in the middle of the CA- 
SP1 helix bundle, where it is inaccessible to PR, 
which acts on extended protein chains (24) 
(Fig. 2D and movie S2). Disassembly of the 
immature lattice and full cleavage between CA 
and SP1 only take place once cleavage has oc- 
curred both between MA and CA and between 
SP1 and NC (10, 25). Together, these observa- 
tions support a model in which the final step in 
maturation is regulated by limiting the access of
PR to its substrate: Cleavage upstream of CA
and downstream of SP1 together destabilize the
immature lattice and the CA-SP1 helix, thereby
allowing PR to cleave between CA and SP1.

MIs block HIV infection by preventing cleavage
at the CA-SP1 site (4, 13, 26) (fig. S5) and
may also stabilize the immature lattice (27, 28). 
We mapped mutations, deletions, and polymor- 
phisms that confer resistance to BVM (5, 29, 30) 
and PF-46396 [another MI (31, 32)] onto our 
structural model (Fig. 3). This revealed that the 
positions of resistance mutations do not map 
out potential drug-binding pockets. Instead, they 
are located at protein-protein interfaces within 
the CA-SP1 lattice. Together with our observation 
that the CA-SP1 cleavage site is inaccessible in the 
immature virus, this implies that the mode of 
action of MIs is not steric inhibition of proteolysis 
but, instead, stabilization of the immature Gag 
lattice. Our data suggest that BVM stabilizes 
the lattice by binding to a site in the center of the 
six-helix bundle (fig. S2). The reduced cleavage 
at the CA-SP1 boundary is a downstream effect 
of stabilization because cleavage of this site requires 
unfolding to an extended chain. HIV-1 appears 
to develop MI resistance by destabilizing its im- 
mature form, thus directly counteracting the sta- 
bilizing effects of MIs. 

Note added in proof: In a concurrent publication,
Wagner et al. report a crystal structure of a
CA-CTD-SP1 construct that adopts a similar con- 
formation to the structure reported here (33). They 
also report that the CA-SP1 cleavage site is within 
a six-helix bundle and suggest that MIs prevent 
proteolytic cleavage by stabilizing this structure. 
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MERLIN. Raman microscopy, a label- 
free, nondestructive technology for 

the identification and imaging of the 
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energy-dispersive X-ray spectroscopy 
that can only identify elemental constitu- 
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driven, push-button mechanism us- 
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The CometChip high-throughput GenTox 
platform allows a 5,000% increase in
throughput over the original comet
assay procedure. The comet assay is
a highly sensitive, direct approach for
the measurement of DNA damage and
repair. However, adoption and routine use
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The Department of Regenerative Medicine and Cell Biology at the Medical University of South Carolina 
is seeking a leader to fill the Smartstate Endowed Chair in Biofabrication Biology at the rank of Professor 
or Associate Professor. This position is part of the South Carolina Advanced Tissue Biofabrication Center 
of Economic Excellence (http://smartstatesc.org/advanced-tissue-biofabrication). A focus of the center is to 
integrate basic cell biology with biofabrication to generate tissues in the laboratory that can be used to advance 
our understanding and treatment of human disease. The successful candidate will have a strong background 
in cell or developmental biology and will integrate the use of bioengineering or tissue fabrication into their 
research program. Candidates interested in the study of vascular, cardiovascular, or digestive disease are 
particularly encouraged to apply. 


Candidates should have a Ph.D. or MD/Ph.D. degree (or equivalent) with credentials that are compatible 
with granting tenure. In addition, candidates should have extensive experience in managing a successful 
research laboratory, an outstanding and internationally recognized record of high impact publications, and a 
compelling history of extramural funding. Current research strengths in the department include pluripotent 
stem cell differentiation, cardiovascular development, digestive disease, and molecular biology of the cell. 
Research at MUSC is supported by several excellent core facilities specializing in imaging, genetically 
modified mice and rats, drug discovery, proteomics, genomics, and flow cytometry. 


MUSC in Charleston was established in 1824 and is ranked in the upper quartile of all freestanding academic 
medical universities. The university has approximately 2,200 graduate and professional students that are 
supported by 2,000 faculty members. In 2014 total financial support topped $217 million with $100 million 
coming from the NIH. This includes funding that supports the NCI-designated Hollings Cancer Center as 
well as a Clinical and Translational Science Award. 


MUSC is an Affirmative Action/Equal Opportunity Employer that strives to hire without regard to gender, race, 
color, national origin, sex, age, religion, disability, sexual orientation, genetic information or veteran status. 
MUSC takes affirmative action to hire and advance women, minorities, protected veterans and individuals 
with disabilities. MUSC has received a 4-year NSF ADVANCE grant to promote the recruitment, retention 
and advancement of women in science. 


Applicants should provide curriculum vitae, research plan, and the names of three references through the 
MUSC employment portal: http://careers.pageuppeople.com/756/cw/en-us/job/495349/univ-associate- 
professorprofessor-regenerative-medicine 
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Lecturer/Assistant Lecturer in Common Core Curriculum 


(Ref.: 201600883)

Applications are invited for appointment as Lecturer/Assistant Lecturer in the Common Core
Curriculum in the Faculty of Science, from as soon as possible, on a two-year fixed-term basis,
with the possibility of renewal.

The University has established a Common Core Curriculum for all undergraduates since September
2012. The Faculty of Science offers courses in the area of Scientific and Technology Literacy to
students from all ten Faculties. A senior Professor in the Faculty of Science will lead each
of these courses and work with the Lecturer/Assistant Lecturer in the relevant area who will
serve primarily as a tutor for small groups of students from multiple disciplines. The medium
of instruction is English. Information about the Common Core Curriculum can be viewed at
http://tl.hku.hk/common-core-curriculum.

Applicants should have a Ph.D. degree in Chemistry, Physics or Biological Sciences, and enthusiasm
for and commitment to teaching. For Lecturer, at least 3 years of post-doctoral experience at the
university level is required. The appointee’s duties include conducting small-group tutorials for
two courses per semester, marking assignments/examination scripts, preparing teaching materials,
supervising student work, developing new courses, and undertaking other tasks related to science
education. He/She will meet regularly with the Centre for the Enhancement of Teaching and Learning
to reflect on and refine the teaching activities in Common Core courses.

A globally competitive remuneration package commensurate with qualifications and experience will
be offered. At current rates, salaries tax does not exceed 15% of gross income. The appointments
will attract a contract-end gratuity and University contribution to a retirement benefits scheme,
totalling up to 15% (Lecturer)/10% (Assistant Lecturer) of basic salary, as well as annual leave, and
medical benefits. Please note that the University is not able to offer a relocation assistance package
(including temporary University housing accommodation and a passage and baggage allowance)
to the successful candidate recruited from overseas.

Enquiries about the specific job requirements should be sent to Professor Pauline Chiu, Faculty of
Science, e-mail: sciappt@hku.hk. Applicants should send a completed application form, together
with an up-to-date C.V. and a statement on teaching philosophy, which includes a portfolio of syllabi
and descriptions of courses they have taught or co-taught, by e-mail to sciappt@hku.hk. They should
also arrange for submission of three references from senior academics who are familiar with their
teaching approaches, skills and experience to sciappt@hku.hk. Please indicate clearly which level
they wish to be considered for and the reference number in the subject of the e-mail. Application
forms (341/1111) can be downloaded at http://www.hku.hk/apptunit/form-ext.doc. Further particulars
can be obtained at http://jobs.hku.hk/. Closes August 19, 2016.

The University thanks applicants for their interest, but advises that only candidates shortlisted for
interviews will be notified of the application result.

The University is an equal opportunities employer and is committed to a Non-Smoking Policy


Assistant/Associate Professor
Dept. of Cell & Developmental Biology

The Department of Cell & Developmental Biology at SUNY Upstate
Medical University in Syracuse, New York invites applications for a
tenure-track position at the Assistant/Associate Professor level.
Candidates must have a PhD or equivalent degree and postdoctoral
experience in cell biology or a related field. Applicants with interests in all
areas of cellular biology are encouraged to apply.

Candidates at the Associate Professor level should have an established
track record of funding and research productivity. Successful candidates
are expected to develop and maintain a vigorous extramurally funded
research program and participate in the education of both medical and
graduate students.

The Department provides a strong, collaborative research community with
interests in developmental and disease model systems, signaling, cell
motility and the cytoskeleton explored in diverse model systems. We offer
excellent departmental resources to support faculty including state of the
art live cell imaging facilities, newly-renovated space, competitive salaries
and startup packages, in addition to outstanding institutional core
facilities. Syracuse is located in central NY and provides a diverse,
dynamic and affordable metropolitan environment with easy access to the
outstanding recreational opportunities of the Adirondack Mountains and
the Finger Lakes, while the proximity of SUNY Upstate to Syracuse
University and SUNY ESF campuses fosters productive scientific
interactions and provides unique collaborative funding opportunities. To
learn more go to http://www.upstate.edu/cdb/.

Please submit a CV and a research statement describing past
accomplishments and future plans, as a single PDF file, and arrange to
have three letters of recommendation sent to: ReeshM@upstate.edu.

Review of applications will begin October 1 and continue until the position
is filled.

At SUNY Upstate Medical University we strive to promote a professional environment
that encourages varied perspectives from faculty members with diverse life
experiences. A respect for diversity is one of our core values. We are committed to
recruiting and supporting a rich community of outstanding faculty, staff and students.
We actively seek applications from women and members of underrepresented groups
to contribute to the diversity of our university community in support of our teaching,
research and clinical missions.
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For recruitment in science, there’s only one Science. 


Looking to fill your postdoc positions?

Scientists will be eager to learn about non-faculty career options
in academia in STEM. What do these jobs involve and how can
scientists prepare for them? What are the entry points and skills
needed for success? Postdocs who have moved into these types
of careers will share their success stories. Reach these readers
and share opportunities at your university or company.

Deliver your message to a global audience of targeted, qualified scientists.

129,574 subscribers in print every week
352,966 monthly unique browsers on ScienceCareers.org

What makes Science the best choice for recruiting?

- Read and respected by 400,000 readers around the globe
- 78% of readers read Science more often than any other journal
- Your ad dollars support AAAS and its programs, which strengthens
  the global scientific community.

Why choose this Postdoc Feature for your advertisement?

- Relevant ads lead off the career section with a special
  “Postdoc” banner.

Expand your exposure by posting your print ad online:

- Link on the job board homepage directly to postdoc positions
- Dedicated landing page for postdoc positions.

Produced by the Science/AAAS Custom Publishing Office.

SCIENCECAREERS.ORG

To book your ad: advertise@sciencecareers.org

Japan: +81 3 3219 5777
China/Korea/Singapore/Taiwan: +86 186 0082 9345

Science Careers, from the journal Science, AAAS
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ANNOUNCEMENTS 


NOTICE 
Cancellation of BoyaLife/Science Prize 
Due to unforeseen circumstances the BoyaLife/Science 
prize has been cancelled. Science/AAAS is no longer 
affiliated with any prize/award that is associated with 
BoyaLife. If you have any questions, please e-mail your
inquiries to scienceadvertising@aaas.org
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RESEARCH FELLOWSHIP IN 
PRECISION MEDICINE 


Columbia University’s Irving Institute for 
Clinical and Translational Research is accepting 
applications for a two-year Precision Medicine 
Fellowship with the goal of training physicians/ 
researchers to use genomics and complex 
clinical data to improve clinical care and clinical 
outcomes by tailoring prevention, screening, 
and medical interventions to individual patient 
characteristics. 


Award Amount: $200,000 ($100,000 per year). 
Quantity: Up to 2 per year. 


Eligibility: Applicants must have a Ph.D., M.D., 
D.D.S., or comparable doctoral degree from an 
accredited domestic or foreign institution. 


Requirements: The applicant will need to 
identify a faculty mentor at Columbia University 
and propose a research project in precision 
medicine to be started on July 1, 2017. 


Application deadline: 5 pm EDT, September 
30, 2016. 


Download application form and instructions at
http://www.irvinginstitute.columbia.edu/resources/precision_med.html


Columbia University is an Equal Opportunity/ 
Affirmative Action Employer. 


CALL FOR PAPERS! 
DOES YOUR LAB ANALYZE 
THE MECHANISMS THAT 
MEDIATE COMMUNICATION
BETWEEN CELLS?
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Stay on top of the 

latest advances in brain 
development and neurological 
disorders with Science 
Signaling, the leading online 
journal of cross-disciplinary 
cell signaling research. The 
journal's high-impact articles 
showcase basic research 


related to cellular and 
organismal regulation relevant 
to development, physiology, 
and disease as well as applied 
signaling research important 
for drug discovery and 
synthetic biology. 


Learn more and submit 
your research today: 
ScienceSignaling.org 


AAAS

Science Signaling


CELL SIGNALING IN PHYSIOLOGY AND DISEASE 
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Growth from failure 


Our task as scientists is to ask difficult questions until we uncover the truth, which means
learning from failed experiments and trying again, undaunted. Yet, after years of honing my 
type A tendencies to succeed academically, I began graduate school with a fear of failure. As I 
searched for the truths related to my project, I struggled with failing at the bench and became 
frustrated and defensive. I have only recently come to respect the critical role of failure in 
research, and I have learned to be open when I encounter it. 


In the beginning of my fifth year of 
grad school, I hit a rough patch in my 
studies—the most recent in a series 
of setbacks—and was not navigating 
the situation well. I had switched to 
a new project in the spring of my 
fourth year, but it was progressing 
slowly and hitting roadblocks. After 
1513 days as a Ph.D. student, I could 
not generate the materials that I 
needed for my experiments. The 
RNA I was trying to extract from 
cancer cells was degrading before I 
could analyze it, and the virus that I 
once could propagate would not co- 
operate. I was unsure when I would 
publish my work and graduate,
and I was feeling purposeless after 
4 years of little progress. 

But instead of asking my labmates 
and principal investigator (PI) for 
help, I adopted a defensive attitude, 
becoming closed to suggestions and hiding the extent of my 
difficulties. I feared that if I acknowledged my lack of prog- 
ress, I would lose the respect of my PI and colleagues. At the 
same time, I berated myself for continual failure. 

One Friday afternoon, my PI approached me about my 
research struggles. It was a challenging conversation. She 
asked how she could help, but I did not know what to tell 
her. Overwhelmed, I asked whether we could meet again on 
Monday morning. She agreed. 

Coping with my shame and the criticism from my PI 
was difficult. After 24 hours of feeling overwhelmed with 
embarrassment and frustration, I finally reached out to my 
mom. She urged me to regain my mental focus by tackling 
a 1000-piece jigsaw puzzle over the weekend—which I now 
see as a corny but honest metaphor for piecing myself back 
together. Having a concrete objective helped me turn my 
thoughts outward in a productive way, toward what I could 
do to move forward. Midway through the puzzle, I decided 
to address my PI’s concerns by making an outline of my 
current progress and a definitive plan for the next month. 




On Monday morning, I presented 
the plan to my PI. I emphasized 
my commitment to continuing my 
research in spite of the failures I 
had experienced. In response, she 
implored me to ask more often for 
help on technical matters. When I 
left her office, I knew that a long 
path of hard work lay ahead, not 
only to produce results but to over- 
come my fear of failure. I promised 
myself that I would start communi- 
cating my struggles to others, ask- 
ing for help more—and accepting it 
when it is offered. 

Now, at the end of my fifth year, 
my research is progressing. Fol- 
lowing a recommendation from 
a labmate, I overcame my RNA 
degradation problem by wearing 
smaller gloves, which reduces con- 
tamination. I am also working on a 
new approach to reconstituting the virus. 

Looking back at that meeting with my PI, I am thankful 
that she approached me privately, in a direct and construc- 
tive manner. Even though her feedback was difficult to pro- 
cess at the time, I learned from that experience to be more 
open with my colleagues about my research setbacks. I have 
found that open discussion helps me organize my thoughts, 
forcing me to face my obstacles. It’s still a challenge, but 
one thing that helps is to remind myself that my colleagues 
and PI only want to help, and that my role as a grad student 
is to learn from others—and from my mistakes. 

Being a scientist requires resilience, and it always will. 
The open, supportive lab atmosphere my PI fosters has 
helped me cope with my fear of failure and become a more 
mature scientist. Quoting a friend, I now strive to be “type 
lowercase-a” instead of type A. 


Tenaya K. Vallery is a Ph.D. candidate in the Department of 
Molecular Biophysics and Biochemistry at Yale University. 
Send your story to SciCareerEditor@aaas.org. 
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